# §3.2 Linear Algebra

## §3.2(i) Gaussian Elimination

To solve the system

 3.2.1 $\mathbf{A}\mathbf{x}=\mathbf{b},$ ⓘ Referenced by: §3.2(iii), §3.2(iii) Permalink: http://dlmf.nist.gov/3.2.E1 Encodings: TeX, pMML, png See also: Annotations for 3.2(i), 3.2 and 3

with Gaussian elimination, where $\mathbf{A}$ is a nonsingular $n\times n$ matrix and $\mathbf{b}$ is an $n\times 1$ vector, we start with the augmented matrix

 3.2.2 $\begin{bmatrix}a_{11}&\cdots&a_{1n}&b_{1}\\ \vdots&\ddots&\vdots&\vdots\\ a_{n1}&\cdots&a_{nn}&b_{n}\end{bmatrix}.$ ⓘ Permalink: http://dlmf.nist.gov/3.2.E2 Encodings: TeX, pMML, png See also: Annotations for 3.2(i), 3.2 and 3

By repeatedly subtracting multiples of each row from the subsequent rows we obtain a matrix of the form

 3.2.3 $\begin{bmatrix}u_{11}&u_{12}&\cdots&u_{1n}&y_{1}\\ 0&u_{22}&\cdots&u_{2n}&y_{2}\\ \vdots&\ddots&\ddots&\vdots&\vdots\\ 0&\cdots&0&u_{nn}&y_{n}\end{bmatrix}.$ ⓘ Referenced by: §3.2(i) Permalink: http://dlmf.nist.gov/3.2.E3 Encodings: TeX, pMML, png See also: Annotations for 3.2(i), 3.2 and 3

During this reduction process we store the multipliers $\ell_{jk}$ that are used in each column to eliminate other elements in that column. This yields a lower triangular matrix of the form

 3.2.4 $\mathbf{L}=\begin{bmatrix}1&0&\cdots&0\\ \ell_{21}&1&\cdots&0\\ \vdots&\ddots&\ddots&\vdots\\ \ell_{n1}&\cdots&\ell_{n,n-1}&1\end{bmatrix}.$ ⓘ Symbols: $\ell_{jk}$: multipliers Permalink: http://dlmf.nist.gov/3.2.E4 Encodings: TeX, pMML, png See also: Annotations for 3.2(i), 3.2 and 3

If we denote by $\mathbf{U}$ the upper triangular matrix comprising the elements $u_{jk}$ in (3.2.3), then we have the factorization, or triangular decomposition,

 3.2.5 $\mathbf{A}=\mathbf{L}\mathbf{U}.$ ⓘ Referenced by: §3.2(i), §3.2(ii) Permalink: http://dlmf.nist.gov/3.2.E5 Encodings: TeX, pMML, png See also: Annotations for 3.2(i), 3.2 and 3

With $\mathbf{y}=[y_{1},y_{2},\dots,y_{n}]^{\rm T}$ the process of solution can then be regarded as first solving the equation $\mathbf{L}\mathbf{y}=\mathbf{b}$ for $\mathbf{y}$ (forward elimination), followed by the solution of $\mathbf{U}\mathbf{x}=\mathbf{y}$ for $\mathbf{x}$ (back substitution).

For more details see Golub and Van Loan (1996, pp. 87–100).

### Example

 3.2.6 $\begin{bmatrix}1&2&3\\ 2&3&1\\ 3&1&2\end{bmatrix}=\begin{bmatrix}1&0&0\\ 2&1&0\\ 3&5&1\end{bmatrix}\begin{bmatrix}1&2&3\\ 0&-1&-5\\ 0&0&18\end{bmatrix}.$ ⓘ Permalink: http://dlmf.nist.gov/3.2.E6 Encodings: TeX, pMML, png See also: Annotations for 3.2(i), 3.2(i), 3.2 and 3

In solving $\mathbf{A}\mathbf{x}=[1,1,1]^{\rm T}$, we obtain by forward elimination $\mathbf{y}=[1,-1,3]^{\rm T}$, and by back substitution $\mathbf{x}=[\frac{1}{6},\frac{1}{6},\frac{1}{6}]^{\rm T}$.

In practice, if any of the multipliers $\ell_{jk}$ are unduly large in magnitude compared with unity, then Gaussian elimination is unstable. To avoid instability the rows are interchanged at each elimination step in such a way that the absolute value of the element that is used as a divisor, the pivot element, is not less than that of the other available elements in its column. Then $\left|\ell_{jk}\right|\leq 1$ in all cases. This modification is called Gaussian elimination with partial pivoting.

For more information on pivoting see Golub and Van Loan (1996, pp. 109–123).

### Iterative Refinement

When the factorization (3.2.5) is available, the accuracy of the computed solution $\mathbf{x}$ can be improved with little extra computation. Because of rounding errors, the residual vector $\mathbf{r}=\mathbf{b}-\mathbf{A}\mathbf{x}$ is nonzero as a rule. We solve the system $\mathbf{A}\delta\mathbf{x}=\mathbf{r}$ for $\delta\mathbf{x}$, taking advantage of the existing triangular decomposition of $\mathbf{A}$ to obtain an improved solution $\mathbf{x}+\delta\mathbf{x}$.

## §3.2(ii) Gaussian Elimination for a Tridiagonal Matrix

Tridiagonal matrices are ones in which the only nonzero elements occur on the main diagonal and two adjacent diagonals. Thus

 3.2.7 $\mathbf{A}=\begin{bmatrix}b_{1}&c_{1}&&&0\\ a_{2}&b_{2}&c_{2}&&\\ &\ddots&\ddots&\ddots&\\ &&a_{n-1}&b_{n-1}&c_{n-1}\\ 0&&&a_{n}&b_{n}\end{bmatrix}.$ ⓘ Permalink: http://dlmf.nist.gov/3.2.E7 Encodings: TeX, pMML, png See also: Annotations for 3.2(ii), 3.2 and 3

Assume that $\mathbf{A}$ can be factored as in (3.2.5), but without partial pivoting. Then

 3.2.8 $\mathbf{L}=\begin{bmatrix}1&0&&&0\\ \ell_{2}&1&0&&\\ &\ddots&\ddots&\ddots&\\ &&\ell_{n-1}&1&0\\ 0&&&\ell_{n}&1\end{bmatrix},$ ⓘ Defines: $\ell_{jk}$: elements of $\mathbf{L}$ (locally) Permalink: http://dlmf.nist.gov/3.2.E8 Encodings: TeX, pMML, png See also: Annotations for 3.2(ii), 3.2 and 3
 3.2.9 $\mathbf{U}=\begin{bmatrix}d_{1}&u_{1}&&&0\\ 0&d_{2}&u_{2}&&\\ &\ddots&\ddots&\ddots&\\ &&0&d_{n-1}&u_{n-1}\\ 0&&&0&d_{n}\end{bmatrix},$ ⓘ Defines: $u_{j}$: elements (locally) Symbols: $d_{j}$: coefficient Permalink: http://dlmf.nist.gov/3.2.E9 Encodings: TeX, pMML, png See also: Annotations for 3.2(ii), 3.2 and 3

where $u_{j}=c_{j}$, $j=1,2,\dots,n-1$, $d_{1}=b_{1}$, and

 3.2.10 $\displaystyle\ell_{j}$ $\displaystyle=a_{j}/d_{j-1},$ $\displaystyle d_{j}$ $\displaystyle=b_{j}-\ell_{j}c_{j-1}$, $j=2,\dots,n$. ⓘ Defines: $d_{j}$: coefficient (locally) Symbols: $\ell_{jk}$: elements of $\mathbf{L}$ Permalink: http://dlmf.nist.gov/3.2.E10 Encodings: TeX, TeX, pMML, pMML, png, png See also: Annotations for 3.2(ii), 3.2 and 3

Forward elimination for solving $\mathbf{A}\mathbf{x}=\mathbf{f}$ then becomes $y_{1}=f_{1}$,

 3.2.11 $y_{j}=f_{j}-\ell_{j}y_{j-1},$ $j=2,\dots,n$, ⓘ Symbols: $\ell_{jk}$: elements of $\mathbf{L}$ and $f_{i}$: element Permalink: http://dlmf.nist.gov/3.2.E11 Encodings: TeX, pMML, png See also: Annotations for 3.2(ii), 3.2 and 3

and back substitution is $x_{n}=y_{n}/d_{n}$, followed by

 3.2.12 $x_{j}=(y_{j}-u_{j}x_{j+1})/d_{j},$ $j=n-1,\dots,1$. ⓘ Symbols: $u_{j}$: elements and $d_{j}$: coefficient Permalink: http://dlmf.nist.gov/3.2.E12 Encodings: TeX, pMML, png See also: Annotations for 3.2(ii), 3.2 and 3

For more information on solving tridiagonal systems see Golub and Van Loan (1996, pp. 152–160).

## §3.2(iii) Condition of Linear Systems

The $p$-norm of a vector $\mathbf{x}=[x_{1},\dots,x_{n}]^{\rm T}$ is given by

 3.2.13 $\displaystyle\|\mathbf{x}\|_{p}$ $\displaystyle=\left(\sum_{j=1}^{n}{\left|x_{j}\right|^{p}}\right)^{1/p},$ $p=1,2,\dots$, $\displaystyle\|\mathbf{x}\|_{\infty}$ $\displaystyle=\max_{1\leq j\leq n}\left|x_{j}\right|.$ ⓘ Referenced by: §3.2(iv) Permalink: http://dlmf.nist.gov/3.2.E13 Encodings: TeX, TeX, pMML, pMML, png, png See also: Annotations for 3.2(iii), 3.2 and 3

The Euclidean norm is the case $p=2$.

The $p$-norm of a matrix $\mathbf{A}=[a_{jk}]$ is

 3.2.14 $\|\mathbf{A}\|_{p}=\max_{\mathbf{x}\neq\boldsymbol{{0}}}\frac{\|\mathbf{A}% \mathbf{x}\|_{p}}{\|\mathbf{x}\|_{p}}\,.$ ⓘ Referenced by: §3.2(iii) Permalink: http://dlmf.nist.gov/3.2.E14 Encodings: TeX, pMML, png See also: Annotations for 3.2(iii), 3.2 and 3

The cases $p=1,2$, and $\infty$ are the most important:

 3.2.15 $\displaystyle\|\mathbf{A}\|_{1}$ $\displaystyle=\max_{1\leq k\leq n}\sum_{j=1}^{n}\left|a_{jk}\right|,$ $\displaystyle\|\mathbf{A}\|_{\infty}$ $\displaystyle=\max_{1\leq j\leq n}\sum_{k=1}^{n}\left|a_{jk}\right|,$ $\displaystyle\|\mathbf{A}\|_{2}$ $\displaystyle=\sqrt{\rho(\mathbf{A}\mathbf{A}^{\rm T})},$ ⓘ Permalink: http://dlmf.nist.gov/3.2.E15 Encodings: TeX, TeX, TeX, pMML, pMML, pMML, png, png, png See also: Annotations for 3.2(iii), 3.2 and 3

where $\rho(\mathbf{A}\mathbf{A}^{\rm T})$ is the largest of the absolute values of the eigenvalues of the matrix $\mathbf{A}\mathbf{A}^{\rm T}$; see §3.2(iv). (We are assuming that the matrix $\mathbf{A}$ is real; if not $\mathbf{A}^{\rm T}$ is replaced by $\mathbf{A}^{\rm H}$, the transpose of the complex conjugate of $\mathbf{A}$.)

The sensitivity of the solution vector $\mathbf{x}$ in (3.2.1) to small perturbations in the matrix $\mathbf{A}$ and the vector $\mathbf{b}$ is measured by the condition number

 3.2.16 $\kappa(\mathbf{A})=\|\mathbf{A}\|_{p}\;\|\mathbf{A}^{-1}\|_{p},$ ⓘ Defines: $\kappa(\mathbf{A})$: condition number (locally) Permalink: http://dlmf.nist.gov/3.2.E16 Encodings: TeX, pMML, png See also: Annotations for 3.2(iii), 3.2 and 3

where $\|\,\cdot\,\|_{p}$ is one of the matrix norms. For any norm (3.2.14) we have $\kappa(\mathbf{A})\geq 1$. The larger the value $\kappa(\mathbf{A})$, the more ill-conditioned the system.

Let $\mathbf{x}^{*}$ denote a computed solution of the system (3.2.1), with $\mathbf{r}=\mathbf{b}-\mathbf{A}\mathbf{x}^{*}$ again denoting the residual. Then we have the a posteriori error bound

 3.2.17 $\frac{\|\mathbf{x}^{*}-\mathbf{x}\|_{p}}{\|\mathbf{x}\|_{p}}\leq\kappa(\mathbf% {A})\frac{\|\mathbf{r}\|_{p}}{\|\mathbf{b}\|_{p}}.$ ⓘ Symbols: $\kappa(\mathbf{A})$: condition number Permalink: http://dlmf.nist.gov/3.2.E17 Encodings: TeX, pMML, png See also: Annotations for 3.2(iii), 3.2 and 3

For further information see Brezinski (1999) and Trefethen and Bau (1997, Chapter 3).

## §3.2(iv) Eigenvalues and Eigenvectors

If $\mathbf{A}$ is an $n\times n$ matrix, then a real or complex number $\lambda$ is called an eigenvalue of $\mathbf{A}$, and a nonzero vector $\mathbf{x}$ a corresponding (right) eigenvector, if

 3.2.18 $\mathbf{A}\mathbf{x}=\lambda\mathbf{x}.$ ⓘ Symbols: $\lambda$: eigenvalue Permalink: http://dlmf.nist.gov/3.2.E18 Encodings: TeX, pMML, png See also: Annotations for 3.2(iv), 3.2 and 3

A nonzero vector $\mathbf{y}$ is called a left eigenvector of $\mathbf{A}$ corresponding to the eigenvalue $\lambda$ if $\mathbf{y}^{\rm T}\mathbf{A}=\lambda\mathbf{y}^{\rm T}$ or, equivalently, $\mathbf{A}^{\rm T}\mathbf{y}=\lambda\mathbf{y}$. A normalized eigenvector has Euclidean norm 1; compare (3.2.13) with $p=2$.

The polynomial

 3.2.19 $p_{n}(\lambda)=\det[\lambda\mathbf{I}-\mathbf{A}]$ ⓘ Symbols: $\det$: determinant and $\lambda$: eigenvalue Permalink: http://dlmf.nist.gov/3.2.E19 Encodings: TeX, pMML, png See also: Annotations for 3.2(iv), 3.2 and 3

is called the characteristic polynomial of $\mathbf{A}$ and its zeros are the eigenvalues of $\mathbf{A}$. The multiplicity of an eigenvalue is its multiplicity as a zero of the characteristic polynomial (§3.8(i)). To an eigenvalue of multiplicity $m$, there correspond $m$ linearly independent eigenvectors provided that $\mathbf{A}$ is nondefective, that is, $\mathbf{A}$ has a complete set of $n$ linearly independent eigenvectors.

## §3.2(v) Condition of Eigenvalues

If $\mathbf{A}$ is nondefective and $\lambda$ is a simple zero of $p_{n}(\lambda)$, then the sensitivity of $\lambda$ to small perturbations in the matrix $\mathbf{A}$ is measured by the condition number

 3.2.20 $\kappa(\lambda)=\frac{1}{\left|\mathbf{y}^{\rm T}\mathbf{x}\right|},$ ⓘ Symbols: $\kappa(\mathbf{A})$: condition number and $\lambda$: eigenvalue Permalink: http://dlmf.nist.gov/3.2.E20 Encodings: TeX, pMML, png See also: Annotations for 3.2(v), 3.2 and 3

where $\mathbf{x}$ and $\mathbf{y}$ are the normalized right and left eigenvectors of $\mathbf{A}$ corresponding to the eigenvalue $\lambda$. Because $\left|\mathbf{y}^{\rm T}\mathbf{x}\right|=\left|\cos\theta\right|$, where $\theta$ is the angle between $\mathbf{y}^{\rm T}$ and $\mathbf{x}$ we always have $\kappa(\lambda)\geq 1$. When $\mathbf{A}$ is a symmetric matrix, the left and right eigenvectors coincide, yielding $\kappa(\lambda)=1$, and the calculation of its eigenvalues is a well-conditioned problem.

## §3.2(vi) Lanczos Tridiagonalization of a Symmetric Matrix

Define the Lanczos vectors $\mathbf{v}_{j}$ by $\mathbf{v}_{0}=\boldsymbol{{0}}$, a normalized vector $\mathbf{v}_{1}$ (perhaps chosen randomly), and for $j=1,2,\dots,n-1$,

 3.2.21 $\displaystyle\beta_{j+1}\mathbf{v}_{j+1}$ $\displaystyle=\mathbf{A}\mathbf{v}_{j}-\alpha_{j}\mathbf{v}_{j}-\beta_{j}% \mathbf{v}_{j-1},$ $\displaystyle\alpha_{j}$ $\displaystyle=\mathbf{v}_{j}^{\rm T}\mathbf{A}\mathbf{v}_{j},$ $\displaystyle\beta_{j+1}$ $\displaystyle=\mathbf{v}_{j+1}^{\rm T}\mathbf{A}\mathbf{v}_{j}.$ ⓘ Defines: $\alpha_{j}$: matrix (locally) and $\beta_{j}$: matrix (locally) Permalink: http://dlmf.nist.gov/3.2.E21 Encodings: TeX, TeX, TeX, pMML, pMML, pMML, png, png, png See also: Annotations for 3.2(vi), 3.2 and 3

Then all $\mathbf{v}_{j}$, $1\leq j\leq n$, are normalized and $\mathbf{v}_{j}^{\rm T}\mathbf{v}_{k}=0$ for $j,k=1,2,\dots,n$, $j\neq k$. The tridiagonal matrix

 3.2.22 $\mathbf{B}=\begin{bmatrix}\alpha_{1}&\beta_{2}&&&0\\ \beta_{2}&\alpha_{2}&\beta_{3}&&\\ &\ddots&\ddots&\ddots&\\ &&\beta_{n-1}&\alpha_{n-1}&\beta_{n}\\ 0&&&\beta_{n}&\alpha_{n}\end{bmatrix}$ ⓘ Symbols: $\alpha_{j}$: matrix and $\beta_{j}$: matrix Permalink: http://dlmf.nist.gov/3.2.E22 Encodings: TeX, pMML, png See also: Annotations for 3.2(vi), 3.2 and 3

has the same eigenvalues as $\mathbf{A}$. Its characteristic polynomial can be obtained from the recursion

 3.2.23 $p_{k+1}(\lambda)=(\lambda-\alpha_{k+1})p_{k}(\lambda)-\beta_{k+1}^{2}p_{k-1}(% \lambda),$ $k=0,1,\dots,n-1$, ⓘ Defines: $p_{k}(\lambda)$: characteristic polynomial (locally) Symbols: $\alpha_{j}$: matrix, $\beta_{j}$: matrix and $\lambda$: eigenvalue Permalink: http://dlmf.nist.gov/3.2.E23 Encodings: TeX, pMML, png See also: Annotations for 3.2(vi), 3.2 and 3

with $p_{-1}(\lambda)=0$, $p_{0}(\lambda)=1$.

For numerical information see Stewart (2001, pp. 347–368).

## §3.2(vii) Computation of Eigenvalues

Many methods are available for computing eigenvalues; see Golub and Van Loan (1996, Chapters 7, 8), Trefethen and Bau (1997, Chapter 5), and Wilkinson (1988, Chapters 8, 9).