NAG Library f04 - Simultaneous Linear Equations
|
Scope of the Chapter
|
|
This chapter is concerned with the solution of the matrix equation Ax = b, where b may be a single vector or a matrix of multiple right-hand sides. The matrix A may be real, complex, symmetric, Hermitian, positive-definite, positive-definite Toeplitz or banded. It may also be rectangular, in which case a least-squares solution is obtained.
For a general introduction to sparse systems of equations, see the f11 Chapter Introduction, which provides functions for sparse systems.
|
|
Background to the Problems
|
|
A set of linear equations may be written in the form

Ax = b

where the known matrix A, with real or complex coefficients, is of size m by n (m rows and n columns), the known right-hand vector b has m components (m rows and one column), and the required solution vector x has n components (n rows and one column). There may also be several vectors b_1, b_2, ..., b_p on the right-hand side and the equations may then be written as

AX = B

the required matrix X having as its columns the solutions of Ax_i = b_i, for i = 1, 2, ..., p. Some functions deal with the latter case, but for clarity only the case p = 1 is discussed here.
The most common problem, the determination of the unique solution of Ax = b, occurs when m = n and A is not singular, that is rank(A) = n. This is discussed in Section [Unique Solution of Ax = b] below. The next most common problem, discussed in Section [The Least-squares Solution of Ax ≈ b, m > n, rank(A) = n] below, is the determination of the least-squares solution of Ax ≈ b required when m > n and rank(A) = n, i.e., the determination of the vector x which minimizes the norm of the residual vector r = b − Ax. All other cases are rank deficient, and they are treated in Section [Rank-deficient Cases].
Most of the functions of the chapter are based on those published in the book edited by Wilkinson and Reinsch (1971). We are very grateful to the late Dr J H Wilkinson FRS for his help and interest during the implementation of this chapter of the Library.
|
Unique Solution of Ax = b
|
|
Most functions in this chapter solve this particular problem. The computation starts with the triangular decomposition A = PLU, where L and U are respectively lower and upper triangular matrices and P is a permutation matrix, chosen so as to ensure that the decomposition is numerically stable. The solution is then obtained by solving in succession the simpler equations

Ly = P^T b,   Ux = y,
the first by forward-substitution and the second by back-substitution.
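To illustrate the same two-stage process outside the Library, here is a minimal sketch using SciPy's LU routines rather than NAG functions; the matrix and right-hand side are made up purely for the example.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Example data (made up purely for illustration).
A = np.array([[4.0, 3.0, 2.0],
              [2.0, 5.0, 1.0],
              [1.0, 2.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

# Triangular decomposition with partial pivoting (A = PLU, stored compactly).
lu, piv = lu_factor(A)

# Forward-substitution (Ly = P^T b) followed by back-substitution (Ux = y).
x = lu_solve((lu, piv), b)

print(x)
print(np.allclose(A @ x, b))   # residual check
```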
If A is real symmetric and positive-definite, U = L^T, while if A is complex Hermitian and positive-definite, U = L^H; in both these cases P is the identity matrix (i.e., no permutations are necessary). In all other cases either L or U has unit diagonal elements.
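In the real symmetric positive-definite case the factorization above reduces to the Cholesky factorization A = L L^T; a corresponding SciPy sketch (again not a NAG call, with a made-up matrix):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Symmetric positive-definite example matrix (made up for illustration).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 3.0, 0.0],
              [1.0, 0.0, 2.0]])
b = np.array([6.0, 4.0, 3.0])

c, low = cho_factor(A)          # Cholesky factorization, no pivoting required
x = cho_solve((c, low), b)      # forward- then back-substitution
print(np.allclose(A @ x, b))    # True
```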
Due to rounding errors the computed "solution" x̄, say, is only an approximation to the true solution x. This approximation will sometimes be satisfactory, agreeing with x to several figures, but if the problem is ill-conditioned then x and x̄ may have few or even no figures in common, and at this stage there is no means of estimating the "accuracy" of x̄.
It must be emphasised that the "true" solution x may not itself be meaningful, that is correct to all figures quoted, if the elements of A and b are known with certainty only to say p figures, where p is smaller than the word-length of the computer.
One approach to assessing the accuracy of the solution is to compute or estimate the condition number of A, defined as

κ(A) = ||A|| · ||A^{-1}||.

Roughly speaking, errors or uncertainties in A or b may be amplified in the solution by a factor κ(A). Thus, for example, if the data in A and b are only accurate to 5 digits and κ(A) is approximately 1000, then the solution cannot be guaranteed to have more than 2 correct digits. If κ(A) is 10^5 or more, the solution may have no meaningful digits.
To be more precise, suppose that

(A + δA) x̄ = b + δb.

Here δA and δb represent perturbations to the matrices A and b which cause a perturbation δx = x̄ − x in the solution. We can define measures of the relative sizes of the perturbations in A, b and x as

ρ_A = ||δA|| / ||A||,   ρ_b = ||δb|| / ||b||,   ρ_x = ||δx|| / ||x||.

Then

ρ_x ≤ κ(A) (ρ_A + ρ_b) / (1 − κ(A) ρ_A),

provided that κ(A) ρ_A < 1. Often κ(A) ρ_A is much smaller than 1 and then the bound effectively simplifies to

ρ_x ≤ κ(A) (ρ_A + ρ_b).
Hence, if we know κ(A), ρ_A and ρ_b, we can compute a bound on the relative errors in the solution. Note that ρ_A, ρ_b and ρ_x are defined in terms of the norms of A, b and x. If A, b or x contains elements of widely differing magnitude, then ρ_A, ρ_b and ρ_x will be dominated by the errors in the larger elements, and ρ_x will give no information about the relative accuracy of smaller elements of x.
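As a concrete illustration (NumPy, not a NAG function; the matrix and the assumed data accuracies ρ_A and ρ_b are invented for the example), the simplified bound can be evaluated directly:

```python
import numpy as np

# A nearly singular, hence ill-conditioned, example matrix (made up).
A = np.array([[1.0, 0.99],
              [0.99, 0.98]])

kappa = np.linalg.cond(A)        # condition number ||A|| * ||A^{-1}|| (2-norm)

rho_A = 1e-7                     # assumed relative uncertainty in A
rho_b = 1e-7                     # assumed relative uncertainty in b

bound = kappa * (rho_A + rho_b)  # simplified bound on the relative error rho_x
print(f"kappa(A) = {kappa:.2e}, bound on relative error = {bound:.2e}")
```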
Another way to obtain useful information about the accuracy of a solution is to solve two sets of equations, one with the given coefficients, which are assumed to be known with certainty to p figures, and one with the coefficients rounded to (p − 1) figures, and to count the number of figures to which the two solutions agree. In ill-conditioned problems this can be surprisingly small and even zero.
Chapter f07 contains functions for estimating condition numbers and for returning error bounds.
|
|
The Least-squares Solution of Ax ≈ b, m > n, rank(A) = n
|
|
The least-squares solution is the vector x̄ which minimizes the sum of the squares of the residuals,

S = (b − Ax̄)^T (b − Ax̄) = ||b − Ax̄||².

The solution is obtained in two steps:

(a) A is reduced by an orthogonal factorization to the form QA = R, where R consists of an n by n upper triangular matrix R_1 on top of an (m − n) by n zero matrix;

(b) the required least-squares solution is obtained by back-substitution in the equation

R_1 x̄ = c_1,

where c_1 consists of the first n elements of Qb.
|
Again, due to rounding errors, the computed x̄ is only an approximation to the required x.
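A minimal NumPy/SciPy sketch of these two steps (illustrative only, not a NAG function; the data are made up, and the reduced factorization A = Q_1 R_1 used below is equivalent to the full form QA = R above):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Overdetermined system: m = 4 equations in n = 2 unknowns (made-up data).
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([2.1, 2.9, 4.2, 4.8])

# Step (a): orthogonal factorization; Q1 is m by n, R1 is n by n upper triangular.
Q1, R1 = np.linalg.qr(A)

# Step (b): back-substitution in R1 x = Q1^T b.
x = solve_triangular(R1, Q1.T @ b)

print(x)                          # least-squares solution
print(np.linalg.norm(b - A @ x))  # norm of the residual
```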
|
|
Rank-deficient Cases
|
|
If, in the least-squares problem just discussed, rank(A) < n, then a least-squares solution exists but it is not unique. In this situation it is usual to ask for the least-squares solution "of minimal length", i.e., the vector x which minimizes ||x||, among all those x for which ||b − Ax|| is a minimum.
This can be computed from the Singular Value Decomposition (SVD) of A, in which A is factorized as

A = Q D P^T (A real)   or   A = Q D P^H (A complex),

where Q is an m by n matrix with orthonormal columns, P is an n by n orthogonal matrix and D is an n by n diagonal matrix. The diagonal elements of D are called the "singular values" of A; they are non-negative and can be arranged in decreasing order of magnitude:

d_1 ≥ d_2 ≥ ... ≥ d_n ≥ 0.

The columns of Q and P are called respectively the left and right singular vectors of A. If the singular values d_{r+1}, ..., d_n are zero or negligible, but d_r is not negligible, then the rank of A is taken to be r (see also Section [The Rank of a Matrix]) and the minimal length least-squares solution of Ax ≈ b is given by

x̄ = P D† Q^T b,

where D† is the diagonal matrix with diagonal elements d_1^{-1}, d_2^{-1}, ..., d_r^{-1}, 0, ..., 0.
The SVD may also be used to find solutions to the homogeneous system of equations Ax = 0, where A is m by n. Such solutions exist if and only if rank(A) < n, and are given by

x = α_{r+1} p_{r+1} + ... + α_n p_n,

where the α_i are arbitrary numbers and the p_i are the columns of P which correspond to negligible elements of D.
The general solution to the rank-deficient least-squares problem is given by x̄ + x, where x̄ is the minimal length least-squares solution and x is any solution of the homogeneous system of equations Ax = 0.
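A NumPy sketch of both the minimal length solution and the homogeneous solutions, using the Q D P^T naming above (illustrative only; the matrix, right-hand side and negligibility threshold are assumptions made for the example):

```python
import numpy as np

# Rank-deficient example: the third column equals the sum of the first two.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
b = np.array([1.0, 1.0, 2.0, 3.0])

# A = Q D P^T: Q (m by n, orthonormal columns), D diagonal, P (n by n, orthogonal).
Q, d, PT = np.linalg.svd(A, full_matrices=False)

tol = max(A.shape) * np.finfo(float).eps * d[0]   # assumed threshold for "negligible"
r = int(np.sum(d > tol))                          # numerical rank

# Minimal length least-squares solution  x_bar = P D+ Q^T b.
d_plus = np.zeros_like(d)
d_plus[:r] = 1.0 / d[:r]
x_bar = PT.T @ (d_plus * (Q.T @ b))

# Columns of P corresponding to negligible singular values solve A x = 0.
null_space = PT.T[:, r:]

print("numerical rank:", r)
print("minimal length solution:", x_bar)
```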
|
|
The Rank of a Matrix
|
|
In theory the rank of A is r if exactly n − r elements of the diagonal matrix D of the singular value decomposition are zero. In practice, due to rounding and/or experimental errors, some of these elements have very small values which usually can and should be treated as zero.
For example, consider a 5 by 8 matrix which has rank 3 in exact arithmetic. On a computer with 7 decimal digits of precision, three of the computed singular values would be of reasonable size while the remaining two would be very small relative to the largest, and the rank would correctly be taken to be 3.
It is not, however, always certain that small computed singular values are really zero. With the 7 by 7 Hilbert matrix, for example, whose elements are a_ij = 1/(i + j − 1), the singular values decrease steadily, spanning many orders of magnitude with no large gap between consecutive values.
Here there is no clear cut-off between small (i.e., negligible) singular values and larger ones. In fact, in exact arithmetic, the matrix is known to have full rank and none of its singular values is zero. On a computer with 7 decimal digits of precision, the matrix is effectively singular, but should its rank be taken to be 6, or 5, or 4?
It is therefore impossible to give an infallible rule, but generally the rank can be taken to be the number of singular values which are neither zero nor very small compared with other singular values. For example, if there is a sharp decrease in singular values from numbers of order unity to numbers of order 10^{-7}, then the latter will almost certainly be zero in a machine in which 7 significant decimal figures is the maximum accuracy. Similarly, for a least-squares problem in which the data is known to about four significant figures and the largest singular value is of order unity, a singular value of order 10^{-4} or less should almost certainly be regarded as zero.
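This rule of thumb can be expressed as a small NumPy helper (illustrative only; the relative tolerances shown are assumptions reflecting machine precision or the accuracy of the data, not NAG defaults):

```python
import numpy as np

def numerical_rank(A, rel_tol):
    """Number of singular values exceeding rel_tol times the largest singular value."""
    d = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(d > rel_tol * d[0]))

# 7 by 7 Hilbert matrix, a_ij = 1 / (i + j - 1).
n = 7
i = np.arange(1, n + 1)
H = 1.0 / (i[:, None] + i[None, :] - 1.0)

# With data good to ~7 figures, singular values below ~1e-7 (relative) are suspect;
# in full double precision the matrix is numerically of full rank.
print(numerical_rank(H, 1e-7), numerical_rank(H, 1e-15))
```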
It should be emphasised that rank determination and least-squares solutions can be sensitive to the scaling of the matrix. If at all possible the units of measurement should be chosen so that the elements of the matrix have data errors of approximately equal magnitude.
|
|
Generalized Linear Least-squares Problems
|
|
The simple type of linear least-squares problem described in Section [The Least-squares Solution of Ax ≈ b, m > n, rank(A) = n] can be generalized in various ways.
|
Linear least-squares problems with equality constraints: minimize, with respect to x, the residual norm ||c − Ax||, subject to the equality constraints Bx = d.
|
|
General Gauss–Markov linear model problems: minimize, with respect to x and y, the norm ||y||, subject to the constraint d = Ax + By.
|
|
|
Calculating the Inverse of a Matrix
|
|
The functions in this chapter can also be used to calculate the inverse of a square matrix A by solving the equation

AX = I,

where I is the identity matrix. However, solving the equations Ax = b by first calculating the inverse of the coefficient matrix A and then forming the product x = A^{-1} b is definitely not recommended.
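In NumPy terms (an illustration only, not a NAG routine), this means preferring a solve to an explicit inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))   # made-up nonsingular test matrix
b = rng.standard_normal(100)

x_good = np.linalg.solve(A, b)        # recommended: factorize and substitute
x_bad = np.linalg.inv(A) @ b          # not recommended: forms A^{-1} explicitly

# Both give similar answers here, but solve() is cheaper and usually more accurate.
print(np.linalg.norm(A @ x_good - b), np.linalg.norm(A @ x_bad - b))
```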
Similar remarks apply to the calculation of the pseudo inverse of a singular or rectangular matrix.
|
|
|
Recommendations on Choice and Use of Available Functions
|
|
|
Black Box and General Purpose Functions
|
|
Most of the functions in this chapter are categorised as Black Box functions or general purpose functions.
Black Box functions solve the equations Ax_i = b_i, for i = 1, 2, ..., p, in a single call, with the matrix A and the right-hand sides b_i being supplied as data. These are the simplest functions to use and are suitable when all the right-hand sides are known in advance and do not occupy too much storage.
General purpose functions, in general, require a previous call to a function in Chapters f01 or f03 to factorize the matrix A. This factorization can then be used repeatedly to solve the equations for one or more right-hand sides which may be generated in the course of the computation. The Black Box functions simply call a factorization function and then a general purpose function to solve the equations.
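The factorize-once, solve-repeatedly pattern of the general purpose functions looks like this in SciPy terms (illustrative only, not NAG calls; the matrix and right-hand sides are generated at random for the example):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 50))   # made-up coefficient matrix

# Factorize A once (the expensive step) ...
lu, piv = lu_factor(A)

# ... then re-use the factorization for right-hand sides produced as the
# computation proceeds.
x = np.zeros(50)
for _ in range(3):
    b_k = rng.standard_normal(50) + 0.1 * x   # right-hand side depending on earlier results
    x = lu_solve((lu, piv), b_k)
```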
|
|
Systems of Linear Equations
|
|
Most of the functions in this chapter solve linear equations Ax = b when A is n by n and a unique solution is expected (the case described in Section [Unique Solution of Ax = b]). If this turns out to be untrue the functions go to a failure exit. The matrix A may be "general" real or complex, or may have special structure or properties, e.g., it may be banded, tridiagonal, almost block-diagonal, sparse, symmetric, Hermitian, positive-definite (or various combinations of these). For some of the combinations see Chapter f07. f04mcc (nag_real_cholesky_skyline_solve) (which needs to be preceded by a call to f01mcc (nag_real_cholesky_skyline)) can be used for the solution of variable band-width (skyline) positive-definite systems.
It must be emphasised that it is a waste of computer time and space to use an inappropriate function, for example one for the complex case when the equations are real. It is also unsatisfactory to use the special functions for a positive-definite matrix if this property is not known in advance.
Other functions for solving linear equation systems, computing inverse matrices, and estimating condition numbers can be found in Chapter f07, which contains LAPACK software.
|
|
Linear Least-squares Problems
|
|
Functions for solving linear least-squares problems using the QR factorization or the SVD can be found in Chapters f01, f02 and f08. When m ≥ n and a unique solution is expected, the QR factorization can be used; otherwise the QR factorization with pivoting, or the SVD, should be used. For m ≫ n, the SVD is not significantly more expensive than the QR factorization. See Chapter f08 for further discussion.
Problems with linear equality constraints can be solved by functions in Chapter f08 provided that the problems are of full rank. Problems with linear inequality constraints can be solved by e04ncc (nag_opt_lin_lsq) in Chapter e04.
General Gauss–Markov linear model problems, as formulated in Section [Generalized Linear Least-squares Problems], can be solved by functions in Chapter f08.
|
|
Sparse Matrix Functions
|
|
For the solution of sparse linear equations see Chapter f11.
|
|
|
Decision Trees
|
|
If at any stage the answer to a question is "Don't know" this should be read as "No".
The name of the function (if any) that should be used to factorize the matrix A is given in brackets after the name of the function for solving the equations.
|
Black Box functions for unique solution of Ax = b
|
|
Q:1 Is A a real matrix?
|
|
General purpose functions for unique solution of Ax = b
|
|
Q:1 Is A a real matrix?
Q:2 Is A a symmetric positive-definite matrix?
Q:3 Is A a variable band matrix?
Q:4 Is A a Hermitian positive-definite matrix?
|
|
General purpose functions for unique solution of Ax = b (Complex matrix)
|
|
Q:1 Is A Hermitian?
Q:2 Is A positive-definite?
Q:3 Is A a band matrix?
Q:4 Is one triangle of A stored as a linear array?
Q:5 Is one triangle of A stored as a linear array?
Q:6 Is A symmetric?
Q:7 Is one triangle of A stored as a linear array?
Q:8 Is A triangular?
Q:9 Is A a band matrix?
Q:10 Is A stored as a linear array?
Q:11 Is A a band matrix?
|
|
|
See Also
|
|
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Lawson C L and Hanson R J (1974) Solving Least-squares Problems Prentice–Hall
Wilkinson J H and Reinsch C (1971) Handbook for Automatic Computation II, Linear Algebra Springer–Verlag
NAG Toolbox Overview.
NAG Web Site.
|
|