
§3.2 Linear Algebra


§3.2(i) Gaussian Elimination

To solve the system

3.2.1\mathbf{A}\mathbf{x}=\mathbf{b},

with Gaussian elimination, where \mathbf{A} is a nonsingular n\times n matrix and \mathbf{b} is an n\times 1 vector, we start with the augmented matrix

3.2.2\begin{bmatrix}a_{{11}}&\cdots&a_{{1n}}&b_{1}\\
\vdots&\ddots&\vdots&\vdots\\
a_{{n1}}&\cdots&a_{{nn}}&b_{n}\end{bmatrix}.

By repeatedly subtracting multiples of each row from the subsequent rows we obtain a matrix of the form

3.2.3\begin{bmatrix}u_{{11}}&u_{{12}}&\cdots&u_{{1n}}&y_{1}\\
0&u_{{22}}&\cdots&u_{{2n}}&y_{2}\\
\vdots&\ddots&\ddots&\vdots&\vdots\\
0&\cdots&0&u_{{nn}}&y_{n}\end{bmatrix}.

During this reduction process we store the multipliers \ell _{{jk}} that are used in each column to eliminate other elements in that column. This yields a lower triangular matrix of the form

3.2.4\mathbf{L}=\begin{bmatrix}1&0&\cdots&0\\
\ell _{{21}}&1&\cdots&0\\
\vdots&\ddots&\ddots&\vdots\\
\ell _{{n1}}&\cdots&\ell _{{n,n-1}}&1\end{bmatrix}.

If we denote by \mathbf{U} the upper triangular matrix comprising the elements u_{{jk}} in (3.2.3), then we have the factorization, or triangular decomposition,

3.2.5\mathbf{A}=\mathbf{L}\mathbf{U}.

With \mathbf{y}=[y_{1},y_{2},\dots,y_{n}]^{{\rm T}} the process of solution can then be regarded as first solving the equation \mathbf{L}\mathbf{y}=\mathbf{b} for \mathbf{y} (forward elimination), followed by the solution of \mathbf{U}\mathbf{x}=\mathbf{y} for \mathbf{x} (back substitution).

For more details see Golub and Van Loan (1996, pp. 87–100).

Example

3.2.6\begin{bmatrix}1&2&3\\
2&3&1\\
3&1&2\end{bmatrix}=\begin{bmatrix}1&0&0\\
2&1&0\\
3&5&1\end{bmatrix}\begin{bmatrix}1&2&3\\
0&-1&-5\\
0&0&18\end{bmatrix}.

In solving \mathbf{A}\mathbf{x}=[1,1,1]^{{\rm T}}, we obtain by forward elimination \mathbf{y}=[1,-1,3]^{{\rm T}}, and by back substitution \mathbf{x}=[\frac{1}{6},\frac{1}{6},\frac{1}{6}]^{{\rm T}}.
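As an illustration, the following is a minimal Python sketch (NumPy assumed; the helper names are illustrative and not part of the original text) that reproduces the factorization (3.2.6) and the solution just given:

```python
import numpy as np

# Gaussian elimination without pivoting, storing the multipliers ell_{jk}
# in a unit lower triangular matrix L, as described above.
def lu_factor_nopivot(A):
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for k in range(n - 1):
        for j in range(k + 1, n):
            L[j, k] = U[j, k] / U[k, k]      # multiplier ell_{jk}
            U[j, k:] -= L[j, k] * U[k, k:]   # subtract multiple of row k
    return L, U

def solve_lu(L, U, b):
    n = len(b)
    y = np.zeros(n)
    for j in range(n):                       # forward elimination: L y = b
        y[j] = b[j] - L[j, :j] @ y[:j]
    x = np.zeros(n)
    for j in range(n - 1, -1, -1):           # back substitution: U x = y
        x[j] = (y[j] - U[j, j+1:] @ x[j+1:]) / U[j, j]
    return x

A = np.array([[1., 2., 3.], [2., 3., 1.], [3., 1., 2.]])
b = np.array([1., 1., 1.])
L, U = lu_factor_nopivot(A)   # L and U as in (3.2.6)
x = solve_lu(L, U, b)         # x = [1/6, 1/6, 1/6]
```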

In practice, if any of the multipliers \ell _{{jk}} are unduly large in magnitude compared with unity, then Gaussian elimination is unstable. To avoid instability the rows are interchanged at each elimination step in such a way that the absolute value of the element that is used as a divisor, the pivot element, is not less than that of the other available elements in its column. Then \left|\ell _{{jk}}\right|\leq 1 in all cases. This modification is called Gaussian elimination with partial pivoting.

For more information on pivoting see Golub and Van Loan (1996, pp. 109–123).
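Gaussian elimination with partial pivoting is available in standard libraries; a brief sketch, assuming SciPy's scipy.linalg.lu_factor and lu_solve are available:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1., 2., 3.], [2., 3., 1.], [3., 1., 2.]])
b = np.array([1., 1., 1.])

lu, piv = lu_factor(A)       # LU factors of a row-permuted A; partial pivoting keeps |ell_jk| <= 1
x = lu_solve((lu, piv), b)   # forward elimination and back substitution; x = [1/6, 1/6, 1/6]
```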

Iterative Refinement

When the factorization (3.2.5) is available, the accuracy of the computed solution \mathbf{x} can be improved with little extra computation. Because of rounding errors, the residual vector \mathbf{r}=\mathbf{b}-\mathbf{A}\mathbf{x} is nonzero as a rule. We solve the system \mathbf{A}\delta\mathbf{x}=\mathbf{r} for \delta\mathbf{x}, taking advantage of the existing triangular decomposition of \mathbf{A} to obtain an improved solution \mathbf{x}+\delta\mathbf{x}.
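A single refinement step can be sketched as follows (a self-contained Python example assuming SciPy; the matrix and right-hand side are the ones from the example above):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1., 2., 3.], [2., 3., 1.], [3., 1., 2.]])
b = np.array([1., 1., 1.])

lu, piv = lu_factor(A)        # triangular decomposition of (a row-permuted) A
x = lu_solve((lu, piv), b)    # computed solution, affected by rounding errors

r = b - A @ x                 # residual; for best effect accumulate in higher precision
dx = lu_solve((lu, piv), r)   # solve A dx = r, reusing the existing factors
x = x + dx                    # improved solution x + dx
```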

§3.2(ii) Gaussian Elimination for a Tridiagonal Matrix

Tridiagonal matrices are ones in which the only nonzero elements occur on the main diagonal and two adjacent diagonals. Thus

3.2.7\mathbf{A}=\begin{bmatrix}b_{1}&c_{1}&&&0\\
a_{2}&b_{2}&c_{2}&&\\
&\ddots&\ddots&\ddots&\\
&&a_{{n-1}}&b_{{n-1}}&c_{{n-1}}\\
0&&&a_{{n}}&b_{n}\end{bmatrix}.

Assume that \mathbf{A} can be factored as in (3.2.5), but without partial pivoting. Then

3.2.8\mathbf{L}=\begin{bmatrix}1&0&&&0\\
\ell _{2}&1&0&&\\
&\ddots&\ddots&\ddots&\\
&&\ell _{{n-1}}&1&0\\
0&&&\ell _{{n}}&1\end{bmatrix},
3.2.9\mathbf{U}=\begin{bmatrix}d_{1}&u_{1}&&&0\\
0&d_{2}&u_{2}&&\\
&\ddots&\ddots&\ddots&\\
&&0&d_{{n-1}}&u_{{n-1}}\\
0&&&0&d_{n}\end{bmatrix},

where u_{j}=c_{j}, j=1,2,\dots,n-1, d_{1}=b_{1}, and

3.2.10
\ell _{j}=a_{j}/d_{{j-1}},
d_{j}=b_{j}-\ell _{j}c_{{j-1}},j=2,\dots,n.

Forward elimination for solving \mathbf{A}\mathbf{x}=\mathbf{f} then becomes y_{1}=f_{1},

3.2.11y_{j}=f_{j}-\ell _{j}y_{{j-1}},j=2,\dots,n,

and back substitution is x_{n}=y_{n}/d_{n}, followed by

3.2.12x_{j}=(y_{j}-u_{j}x_{{j+1}})/d_{j},j=n-1,\dots,1.

For more information on solving tridiagonal systems see Golub and Van Loan (1996, pp. 152–160).
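The recurrences (3.2.10)–(3.2.12) translate directly into code; a minimal Python sketch (NumPy assumed; array names follow (3.2.7), and no pivoting is performed):

```python
import numpy as np

def solve_tridiagonal(a, b, c, f):
    """Solve A x = f for tridiagonal A, following (3.2.8)-(3.2.12).

    a : subdiagonal (a_2, ..., a_n), length n-1
    b : main diagonal (b_1, ..., b_n), length n
    c : superdiagonal (c_1, ..., c_{n-1}), length n-1
    The factorization without pivoting is assumed to exist.
    """
    n = len(b)
    d = np.empty(n)                        # diagonal of U
    y = np.empty(n)
    d[0] = b[0]
    y[0] = f[0]
    for j in range(1, n):                  # factorization and forward elimination
        ell = a[j - 1] / d[j - 1]          # multiplier ell_j
        d[j] = b[j] - ell * c[j - 1]
        y[j] = f[j] - ell * y[j - 1]
    x = np.empty(n)
    x[n - 1] = y[n - 1] / d[n - 1]         # back substitution
    for j in range(n - 2, -1, -1):
        x[j] = (y[j] - c[j] * x[j + 1]) / d[j]
    return x
```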

§3.2(iii) Condition of Linear Systems

The p-norm of a vector \mathbf{x}=[x_{1},\dots,x_{n}]^{{\rm T}} is given by

3.2.13
\|\mathbf{x}\| _{p}=\left(\sum _{{j=1}}^{n}\left|x_{j}\right|^{p}\right)^{{1/p}},p=1,2,\dots,
\|\mathbf{x}\| _{\infty}=\max _{{1\leq j\leq n}}\left|x_{j}\right|.

The Euclidean norm is the case p=2.

The p-norm of a matrix \mathbf{A}=[a_{{jk}}] is

3.2.14\|\mathbf{A}\| _{p}=\max _{{\mathbf{x}\neq\boldsymbol{{0}}}}\frac{\|\mathbf{A}\mathbf{x}\| _{p}}{\|\mathbf{x}\| _{p}}\,.

The cases p=1,2, and \infty are the most important:

3.2.15
\|\mathbf{A}\| _{1}=\max _{{1\leq k\leq n}}\sum _{{j=1}}^{n}\left|a_{{jk}}\right|,
\|\mathbf{A}\| _{\infty}=\max _{{1\leq j\leq n}}\sum _{{k=1}}^{n}\left|a_{{jk}}\right|,
\|\mathbf{A}\| _{2}=\sqrt{\rho(\mathbf{A}\mathbf{A}^{{\rm T}})},

where \rho(\mathbf{A}\mathbf{A}^{{\rm T}}) is the largest of the absolute values of the eigenvalues of the matrix \mathbf{A}\mathbf{A}^{{\rm T}}; see §3.2(iv). (We are assuming that the matrix \mathbf{A} is real; if not \mathbf{A}^{{\rm T}} is replaced by \mathbf{A}^{{\rm H}}, the transpose of the complex conjugate of \mathbf{A}.)
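These vector and matrix norms are available in NumPy; a brief illustration (np.linalg.norm assumed):

```python
import numpy as np

A = np.array([[1., 2., 3.], [2., 3., 1.], [3., 1., 2.]])
x = np.array([1., -2., 2.])

np.linalg.norm(x, 1)         # |x_1| + ... + |x_n| = 5
np.linalg.norm(x, 2)         # Euclidean norm = 3
np.linalg.norm(x, np.inf)    # max |x_j| = 2
np.linalg.norm(A, 1)         # maximum absolute column sum
np.linalg.norm(A, np.inf)    # maximum absolute row sum
np.linalg.norm(A, 2)         # sqrt of the largest eigenvalue of A A^T
```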

The sensitivity of the solution vector \mathbf{x} in (3.2.1) to small perturbations in the matrix \mathbf{A} and the vector \mathbf{b} is measured by the condition number

3.2.16\kappa(\mathbf{A})=\|\mathbf{A}\| _{p}\;\|\mathbf{A}^{{-1}}\| _{p},

where \|\,\cdot\,\| _{p} is one of the matrix norms. For any norm (3.2.14) we have \kappa(\mathbf{A})\geq 1. The larger the value \kappa(\mathbf{A}), the more ill-conditioned the system.

Let \mathbf{x}^{{*}} denote a computed solution of the system (3.2.1), with \mathbf{r}=\mathbf{b}-\mathbf{A}\mathbf{x}^{{*}} again denoting the residual. Then we have the a posteriori error bound

3.2.17\frac{\|\mathbf{x}^{{*}}-\mathbf{x}\| _{p}}{\|\mathbf{x}\| _{p}}\leq\kappa(\mathbf{A})\frac{\|\mathbf{r}\| _{p}}{\|\mathbf{b}\| _{p}}.
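A sketch of this bound in Python (NumPy assumed; the right-hand side of (3.2.17) is computable from the data, whereas the left-hand side requires the exact solution):

```python
import numpy as np

A = np.array([[1., 2., 3.], [2., 3., 1.], [3., 1., 2.]])
b = np.array([1., 1., 1.])

x_star = np.linalg.solve(A, b)       # computed solution x*
r = b - A @ x_star                   # residual

kappa = np.linalg.cond(A, np.inf)    # kappa(A) = ||A||_inf ||A^{-1}||_inf
bound = kappa * np.linalg.norm(r, np.inf) / np.linalg.norm(b, np.inf)
# By (3.2.17), the relative error ||x* - x||_inf / ||x||_inf is at most `bound`.
```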

For further information see Brezinski (1999) and Trefethen and Bau (1997, Chapter 3).

§3.2(iv) Eigenvalues and Eigenvectors

If \mathbf{A} is an n\times n matrix, then a real or complex number \lambda is called an eigenvalue of \mathbf{A}, and a nonzero vector \mathbf{x} a corresponding (right) eigenvector, if

3.2.18\mathbf{A}\mathbf{x}=\lambda\mathbf{x}.

A nonzero vector \mathbf{y} is called a left eigenvector of \mathbf{A} corresponding to the eigenvalue \lambda if \mathbf{y}^{{\rm T}}\mathbf{A}=\lambda\mathbf{y}^{{\rm T}} or, equivalently, \mathbf{A}^{{\rm T}}\mathbf{y}=\lambda\mathbf{y}. A normalized eigenvector has Euclidean norm 1; compare (3.2.13) with p=2.

The polynomial

3.2.19p_{n}(\lambda)=\det[\lambda\mathbf{I}-\mathbf{A}]

is called the characteristic polynomial of \mathbf{A} and its zeros are the eigenvalues of \mathbf{A}. The multiplicity of an eigenvalue is its multiplicity as a zero of the characteristic polynomial (§3.8(i)). To an eigenvalue of multiplicity m, there correspond m linearly independent eigenvectors provided that \mathbf{A} is nondefective, that is, \mathbf{A} has a complete set of n linearly independent eigenvectors.
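For illustration, a small Python sketch (NumPy assumed; the 2×2 matrix is a hypothetical example) comparing the eigenvalues returned by np.linalg.eigvals with the zeros of the characteristic polynomial (3.2.19):

```python
import numpy as np

A = np.array([[2., 1.], [1., 2.]])   # illustrative symmetric 2x2 matrix

w = np.linalg.eigvals(A)             # eigenvalues of A: 3 and 1 (in some order)
c = np.poly(A)                       # coefficients of det(lambda I - A) = lambda^2 - 4 lambda + 3
z = np.roots(c)                      # zeros of the characteristic polynomial (the same values)
```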

§3.2(v) Condition of Eigenvalues

If \mathbf{A} is nondefective and \lambda is a simple zero of p_{n}(\lambda), then the sensitivity of \lambda to small perturbations in the matrix \mathbf{A} is measured by the condition number

3.2.20\kappa(\lambda)=\frac{1}{\left|\mathbf{y}^{{\rm T}}\mathbf{x}\right|},

where \mathbf{x} and \mathbf{y} are the normalized right and left eigenvectors of \mathbf{A} corresponding to the eigenvalue \lambda. Because \left|\mathbf{y}^{{\rm T}}\mathbf{x}\right|=\left|\cos\theta\right|, where \theta is the angle between \mathbf{y} and \mathbf{x}, we always have \kappa(\lambda)\geq 1. When \mathbf{A} is a symmetric matrix, the left and right eigenvectors coincide, yielding \kappa(\lambda)=1, and the calculation of its eigenvalues is a well-conditioned problem.
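As an illustration of (3.2.20), the following sketch (SciPy assumed; the nonsymmetric matrix is a hypothetical example chosen to have poorly conditioned eigenvalues) computes \kappa(\lambda) from normalized left and right eigenvectors:

```python
import numpy as np
from scipy.linalg import eig

# A hypothetical nonsymmetric matrix with poorly conditioned eigenvalues.
A = np.array([[1., 1000.], [0., 2.]])

w, vl, vr = eig(A, left=True, right=True)       # eigenvalues, left and right eigenvectors
for k in range(len(w)):
    x = vr[:, k] / np.linalg.norm(vr[:, k])     # normalized right eigenvector
    y = vl[:, k] / np.linalg.norm(vl[:, k])     # normalized left eigenvector
    kappa = 1.0 / abs(y.conj() @ x)             # condition number (3.2.20); about 1000 here
    print(w[k], kappa)
```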

§3.2(vi) Lanczos Tridiagonalization of a Symmetric Matrix

Define the Lanczos vectors \mathbf{v}_{j} by \mathbf{v}_{0}=\boldsymbol{{0}}, a normalized vector \mathbf{v}_{1} (perhaps chosen randomly), and for j=1,2,\dots,n-1,

3.2.21
\beta _{{j+1}}\mathbf{v}_{{j+1}}=\mathbf{A}\mathbf{v}_{j}-\alpha _{j}\mathbf{v}_{j}-\beta _{j}\mathbf{v}_{{j-1}},
\alpha _{j}=\mathbf{v}_{j}^{{\rm T}}\mathbf{A}\mathbf{v}_{j},
\beta _{{j+1}}=\mathbf{v}_{{j+1}}^{{\rm T}}\mathbf{A}\mathbf{v}_{j}.

Then all \mathbf{v}_{j}, 1\leq j\leq n, are normalized and \mathbf{v}_{j}^{{\rm T}}\mathbf{v}_{k}=0 for j,k=1,2,\dots,n, j\neq k. The tridiagonal matrix

3.2.22\mathbf{B}=\begin{bmatrix}\alpha _{1}&\beta _{2}&&&0\\
\beta _{2}&\alpha _{2}&\beta _{3}&&\\
&\ddots&\ddots&\ddots&\\
&&\beta _{{n-1}}&\alpha _{{n-1}}&\beta _{{n}}\\
0&&&\beta _{{n}}&\alpha _{n}\end{bmatrix}

has the same eigenvalues as \mathbf{A}. Its characteristic polynomial can be obtained from the recursion

3.2.23p_{{k+1}}(\lambda)=(\lambda-\alpha _{{k+1}})p_{k}(\lambda)-\beta _{{k+1}}^{2}p_{{k-1}}(\lambda),k=0,1,\dots,n-1,

with p_{{-1}}(\lambda)=0, p_{0}(\lambda)=1.

For numerical information see Stewart (2001, pp. 347–368).
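The recurrence (3.2.21) admits a short Python sketch (NumPy assumed; the random symmetric test matrix at the end is illustrative only). This is a plain restatement of the recurrence: in floating-point arithmetic the Lanczos vectors gradually lose orthogonality, so practical implementations reorthogonalize; see Stewart (2001) cited above.

```python
import numpy as np

def lanczos(A, v1, m):
    """Lanczos tridiagonalization (3.2.21) of a real symmetric matrix A.

    Returns alpha_1,...,alpha_m and beta_2,...,beta_m of the matrix B in (3.2.22).
    Plain sketch: no reorthogonalization, and breakdown (beta_{j+1} = 0) is not handled.
    """
    n = A.shape[0]
    v_prev = np.zeros(n)                 # v_0 = 0
    v = v1 / np.linalg.norm(v1)          # normalized starting vector v_1
    alpha = np.empty(m)
    beta = np.zeros(m)                   # beta[0] plays the role of the unused beta_1
    for j in range(m):
        w = A @ v
        alpha[j] = v @ w                 # alpha_j = v_j^T A v_j
        w = w - alpha[j] * v - beta[j] * v_prev
        if j + 1 < m:
            beta[j + 1] = np.linalg.norm(w)   # beta_{j+1} = v_{j+1}^T A v_j (by orthogonality)
            v_prev, v = v, w / beta[j + 1]
    return alpha, beta[1:]

# With m = n, the tridiagonal matrix B has the same eigenvalues as A:
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                                   # a random symmetric matrix
alpha, beta = lanczos(A, rng.standard_normal(5), 5)
B = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
# np.linalg.eigvalsh(B) and np.linalg.eigvalsh(A) agree up to rounding.
```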

§3.2(vii) Computation of Eigenvalues

Many methods are available for computing eigenvalues; see Golub and Van Loan (1996, Chapters 7, 8), Trefethen and Bau (1997, Chapter 5), and Wilkinson (1988, Chapters 8, 9).