NAG Library f11 - Large Scale Linear Systems

Scope of the Chapter

This chapter provides functions for the solution of large sparse systems of simultaneous linear equations. These include iterative methods for real nonsymmetric and real symmetric linear systems. Direct methods are currently available in Chapters f01 and f04.
For a wider selection of functions for sparse linear algebra, users are referred to the Harwell Sparse Matrix Library (available from NAG), especially for direct methods for solving linear systems.

Background to the Problems

This section is only a brief introduction to the solution of sparse linear systems. For a more detailed discussion see for example Duff et al. (1986) for direct methods, or Barrett et al. (1994) for iterative methods.

Sparse Matrices and Their Storage

A matrix A may be described as sparse if the number of zero elements is sufficiently large that it is worthwhile using algorithms which avoid computations involving zero elements.
If A is sparse, and the chosen algorithm requires the matrix coefficients to be stored, a significant saving in storage can often be made by storing only the non-zero elements. A number of different formats may be used to represent sparse matrices economically. These differ according to the amount of storage required, the amount of indirect addressing required for fundamental operations such as matrix-vector products, and their suitability for vector and/or parallel architectures. For a survey of some of these storage formats see Barrett et al. (1994).
Some of the functions in this chapter have been designed to be independent of the matrix storage format. This allows users to choose their own preferred format, or to avoid storing the matrix altogether. Other functions are so-called Black Box functions, which are easier to use but are based on fixed storage formats. Three such formats are currently provided: coordinate storage (CS) format, symmetric coordinate storage (SCS) format and compressed column storage (CCS) format.

Coordinate storage (CS) format

This storage format represents a sparse matrix A, with nnz non-zero elements, in terms of three one-dimensional arrays: a real or complex array a and two integer arrays irow and icol. These arrays are all of dimension at least nnz. The array a contains the non-zero elements themselves, while irow and icol store the corresponding row and column indices respectively.
For example, the matrix

    A  =  (  2   1   0   0   0 )
          (  0   0   3  -1   0 )
          (  0   0   4   0   0 )
          (  0   0   0   5   2 )
          ( -2   0   0   0   3 )

might be represented in the arrays a, irow and icol as

    a    = ( 2, 1, 3, -1, 4, 5, 2, -2, 3 )
    irow = ( 1, 1, 2, 2, 3, 4, 4, 5, 5 )
    icol = ( 1, 2, 3, 4, 3, 4, 5, 1, 5 ).

Notes

The general format specifies no ordering of the array elements, but some functions may impose a specific ordering. For example, the non-zero elements may be required to be ordered by increasing row index, and by increasing column index within each row, as in the example above. A utility function, f11zac (nag_sparse_nsym_sort), is provided to order the elements appropriately.

With this storage format it is possible to enter duplicate elements. These may be interpreted in various ways: for example, raising an error, ignoring all but the first entry, ignoring all but the last entry, or summing the duplicates.
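
As a concrete illustration, the minimal C sketch below (illustrative only, not a NAG function; the name cs_matvec and the unit test vector are assumptions of this example) forms the matrix-vector product y = Ax directly from the three CS arrays of the example above, using the 1-based index convention. Note that accumulating into y means that duplicate entries are implicitly summed, one of the interpretations mentioned above.

    /* Minimal sketch (not a NAG function): y = A*x for a matrix held in
       coordinate storage (CS) format.  Array names a, irow, icol and the
       count nnz follow the conventions described above; indices are
       1-based, as in the example. */
    #include <stdio.h>

    static void cs_matvec(int n, int nnz, const double a[],
                          const int irow[], const int icol[],
                          const double x[], double y[])
    {
        for (int i = 0; i < n; i++)
            y[i] = 0.0;
        /* Each stored element contributes a[k]*x[icol[k]] to row irow[k];
           duplicate (irow, icol) pairs are implicitly summed here. */
        for (int k = 0; k < nnz; k++)
            y[irow[k] - 1] += a[k] * x[icol[k] - 1];
    }

    int main(void)
    {
        /* The 5-by-5 example matrix from the text */
        double a[]    = { 2.0, 1.0, 3.0, -1.0, 4.0, 5.0, 2.0, -2.0, 3.0 };
        int    irow[] = { 1, 1, 2, 2, 3, 4, 4, 5, 5 };
        int    icol[] = { 1, 2, 3, 4, 3, 4, 5, 1, 5 };
        double x[5]   = { 1.0, 1.0, 1.0, 1.0, 1.0 }, y[5];

        cs_matvec(5, 9, a, irow, icol, x, y);   /* y = row sums of A */
        for (int i = 0; i < 5; i++)
            printf("y[%d] = %g\n", i, y[i]);
        return 0;
    }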

Symmetric coordinate storage (SCS) format

This storage format is suitable for symmetric and Hermitian matrices, and is identical to the CS format described in Section [Coordinate storage (CS) format], except that only the non-zero elements of the lower triangle are stored. Thus, for example, the symmetric matrix

    A  =  (  4   1   0   0   0 )
          (  1   5   2   0   0 )
          (  0   2   6   0   1 )
          (  0   0   0   7   0 )
          (  0   0   1   0   8 )

might be represented in the arrays a, irow and icol as

    a    = ( 4, 1, 5, 2, 6, 7, 1, 8 )
    irow = ( 1, 2, 2, 3, 3, 4, 5, 5 )
    icol = ( 1, 1, 2, 2, 3, 4, 3, 5 ).
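
A corresponding C sketch for SCS format (again illustrative, not a NAG function) forms y = Ax from the lower triangle alone: each strictly lower triangular entry is applied twice, once for each of the two symmetric positions it represents.

    /* Minimal sketch (not a NAG function): y = A*x for a symmetric matrix
       held in SCS format, i.e., only the lower triangle is stored.
       Indices are 1-based, as in the example above. */
    void scs_matvec(int n, int nnz, const double a[],
                    const int irow[], const int icol[],
                    const double x[], double y[])
    {
        for (int i = 0; i < n; i++)
            y[i] = 0.0;
        for (int k = 0; k < nnz; k++) {
            int i = irow[k] - 1, j = icol[k] - 1;
            y[i] += a[k] * x[j];
            if (i != j)              /* mirror each strictly lower entry */
                y[j] += a[k] * x[i];
        }
    }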

Iterative Methods

Iterative methods for the solution of the system of linear equations

    Ax = b                                                        (1)

approach the solution x through a sequence of approximations until some user-specified termination criterion is met or until some predefined maximum number of iterations has been reached. The number of iterations required for convergence is not generally known in advance, as it depends on the accuracy required and on the matrix A: its sparsity pattern, conditioning and eigenvalue spectrum.
Faster convergence can often be achieved using a preconditioner (see Golub and Van Loan (1996) and Barrett et al. (1994)). A preconditioner maps the original system of equations onto a different system

    Āx̄ = b̄                                                      (2)

which hopefully exhibits better convergence characteristics: for example, the condition number of the coefficient matrix Ā may be better than that of A, or it may have eigenvalues of greater multiplicity.
An unsuitable preconditioner, or no preconditioning at all, may result in a very slow rate of convergence or in a complete lack of convergence. However, preconditioning involves a trade-off between the reduction in the number of iterations required for convergence and the additional computational cost per iteration. Also, setting up a preconditioner may involve non-negligible overheads. The application of preconditioners to real nonsymmetric, complex non-Hermitian, real symmetric and complex Hermitian systems of equations is further considered in Sections [Iterative Methods for Real Nonsymmetric Linear Systems] and [Iterative Methods for Real Symmetric Linear Systems].
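
As a concrete illustration of this trade-off, the simplest of all preconditioners, the Jacobi preconditioner M = diag(A), costs almost nothing to set up or apply. The C sketch below uses hypothetical helper names (not a NAG interface): it extracts the diagonal from CS arrays and then solves Mv = u with one division per component.

    /* Illustrative sketch only: the Jacobi preconditioner M = diag(A).
       Applying M^(-1) needs no factorization, only the diagonal of A,
       extracted here from CS arrays (1-based indices). */
    #include <stdlib.h>

    static double *jacobi_diag(int n, int nnz, const double a[],
                               const int irow[], const int icol[])
    {
        double *d = malloc(n * sizeof *d);
        for (int i = 0; i < n; i++)
            d[i] = 1.0;               /* fallback for an absent diagonal */
        for (int k = 0; k < nnz; k++)
            if (irow[k] == icol[k])
                d[irow[k] - 1] = a[k];
        return d;
    }

    /* Solve M v = u, i.e., v = M^(-1) u. */
    static void jacobi_solve(int n, const double d[],
                             const double u[], double v[])
    {
        for (int i = 0; i < n; i++)
            v[i] = u[i] / d[i];
    }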

Iterative Methods for Real Nonsymmetric Linear Systems

Many of the most effective iterative methods for the solution of (1) lie in the class of non-stationary Krylov subspace methods (see Barrett et al. (1994)). For real nonsymmetric matrices this class includes:

the restarted generalized minimum residual (RGMRES) method (see Saad and Schultz (1986));

the conjugate gradient squared (CGS) method (see Sonneveld (1989));

the polynomial stabilized bi-conjugate gradient (Bi-CGSTAB(ℓ)) method (see Van der Vorst (1992) and Sleijpen and Fokkema (1993));

the transpose-free quasi-minimal residual (TFQMR) method (see Freund and Nachtigal (1991) and Freund (1993)).

Here we just give a brief overview of these algorithms as implemented in this chapter.
RGMRES is based on the Arnoldi method, which explicitly generates an orthogonal basis for the Krylov subspace span{A^k r_0}, k = 0, 1, 2, ..., where r_0 is the initial residual. The solution is then expanded onto the orthogonal basis so as to minimize the residual norm. For real nonsymmetric and complex non-Hermitian matrices the generation of the basis requires a "long" recurrence relation, resulting in prohibitive computational and storage costs. RGMRES limits these costs by restarting the Arnoldi process from the latest available residual every m iterations. The value of m is chosen in advance and is fixed throughout the computation. Unfortunately, an optimum value of m cannot easily be predicted.
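
The "long" recurrence can be seen explicitly in the single Arnoldi step sketched below (didactic C code, not the NAG implementation; V, H and the function names are assumptions of this example): the new basis vector must be orthogonalized against every previous basis vector, so both the work and the storage per iteration grow with the iteration number, which is what restarting every m steps is designed to bound.

    /* One Arnoldi step with modified Gram-Schmidt.  V is n-by-(m+1),
       column-major, holding the orthonormal basis; H is the upper
       Hessenberg matrix with leading dimension ldh; matvec is any
       user routine computing y = A*x. */
    #include <math.h>

    static double dot(int n, const double *x, const double *y)
    {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += x[i] * y[i];
        return s;
    }

    static void arnoldi_step(int n, int j, double *V, double *H, int ldh,
                             void (*matvec)(int, const double *, double *))
    {
        double *w = V + (j + 1) * n;        /* will become v_{j+1} */
        matvec(n, V + j * n, w);            /* w = A v_j */
        for (int i = 0; i <= j; i++) {      /* long recurrence: all i <= j */
            H[i + j * ldh] = dot(n, V + i * n, w);
            for (int k = 0; k < n; k++)
                w[k] -= H[i + j * ldh] * V[i * n + k];
        }
        /* normalize v_{j+1}; a zero norm signals breakdown (ignored here) */
        H[(j + 1) + j * ldh] = sqrt(dot(n, w, w));
        for (int k = 0; k < n; k++)
            w[k] /= H[(j + 1) + j * ldh];
    }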
CGS is a development of the bi-conjugate gradient method in which the nonsymmetric Lanczos method is applied to reduce the coefficient matrix to tri-diagonal form: two bi-orthogonal sequences of vectors are generated, starting from the initial residual r_0 and from the shadow residual r̂_0 corresponding to the arbitrary problem A^T x̂ = b̂, where b̂ is chosen so that r_0 = r̂_0. In the course of the iteration, the residual and shadow residual r_i = P_i(A) r_0 and r̂_i = P_i(A^T) r̂_0 are generated, where P_i is a polynomial of order i, and bi-orthogonality is exploited by computing the vector product ρ_i = (r̂_i, r_i) = (P_i(A^T) r̂_0, P_i(A) r_0) = (r̂_0, P_i^2(A) r_0). Applying the "contraction" operator P_i(A) twice, the iteration coefficients can still be recovered without advancing the solution of the shadow problem, which is of no interest. The CGS method often provides fast convergence; however, there is no reason why the contraction operator should also reduce the once reduced vector P_i(A) r_0: this can lead to highly irregular convergence.
Bi-CGSTAB(ℓ) is similar to the CGS method. However, instead of generating the sequence {P_i^2(A) r_0}, it generates the sequence {Q_i(A) P_i(A) r_0}, where the Q_i are polynomials chosen to minimize the residual after the application of the contraction operator P_i(A). Two main steps can be identified for each iteration: an OR (Orthogonal Residuals) step where a basis of order ℓ is generated by a Bi-CG iteration, and an MR (Minimum Residuals) step where the residual is minimized over the basis generated, by a method akin to GMRES. For ℓ = 1, the method corresponds to the Bi-CGSTAB method of Van der Vorst (1992). For ℓ > 1, more information about complex eigenvalues of the iteration matrix can be taken into account, and this may lead to improved convergence and robustness. However, as ℓ increases, numerical instabilities may arise.
TFQMR (see Freund and Nachtigal (1991) and Freund (1993)) is conceptually derived from the CGS method: the CGS iterates are combined so as to quasi-minimize the residual norm, which smooths the often irregular convergence behaviour of CGS.
Faster convergence can usually be achieved by using a preconditioner. A left preconditioner M can be used by the RGMRES, CGS and TFQMR methods, such that Ā = M^(-1)A ≈ I_n in (2), where I_n is the identity matrix of order n; a right preconditioner M can be used by the Bi-CGSTAB(ℓ) method, such that Ā = AM^(-1) ≈ I_n. These are formal definitions, used only in the design of the algorithms; in practice, only the means to compute the matrix-vector products v = Au and v = A^T u (the latter only being required when an estimate of ||A||_1 or ||A||_inf is computed internally), and to solve the preconditioning equations Mv = u, are required; that is, explicit information about M, or its inverse, is not required at any stage.
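
This "operations only" design can be made concrete as a set of callbacks. The C sketch below shows one hypothetical interface of this kind (it is not the NAG interface): a solver written against it never needs direct access to A, M, or any particular storage format.

    /* Sketch of an operations-only solver interface: the iterative
       method needs nothing about A or M beyond these callbacks. */
    typedef void (*matvec_fn)(int n, const double *u, double *v,
                              void *data);            /* v = A u   */
    typedef void (*tmatvec_fn)(int n, const double *u, double *v,
                               void *data);           /* v = A^T u */
    typedef void (*precond_fn)(int n, const double *u, double *v,
                               void *data);           /* solve M v = u */

    struct solver_ops {
        matvec_fn  matvec;    /* always required                        */
        tmatvec_fn tmatvec;   /* only if norm estimates are requested   */
        precond_fn psolve;    /* identity operation if unpreconditioned */
        void      *data;      /* user matrix in any storage format      */
    };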
Preconditioning matrices M are typically based on incomplete factorizations (see Meijerink and Van der Vorst (1981)), or on the approximate inverses occurring in stationary iterative methods (see Young (1971)). A common example is the incomplete LU factorization

    A = M + R,    M = PLDUQ

where L is lower triangular with unit diagonal elements, D is diagonal, U is upper triangular with unit diagonal elements, P and Q are permutation matrices, and R is a remainder matrix. A zero-fill incomplete LU factorization is one for which the matrix M has the same pattern of non-zero entries as A. This is obtained by discarding any fill elements (non-zero elements of M arising during the factorization in locations where A has zero elements). Allowing some of these fill elements to be kept rather than discarded generally increases the accuracy of the factorization at the expense of some loss of sparsity. For further details see Barrett et al. (1994).

Iterative Methods for Real Symmetric Linear Systems

Two of the best known iterative methods applicable to real symmetric and complex Hermitian linear systems are the conjugate gradient (CG) method (see Hestenes and Stiefel (1952) and Golub and Van Loan (1996)) and a Lanczos-type method based on SYMMLQ (see Paige and Saunders (1975)).
For the CG method the matrix A should ideally be positive-definite. The application of CG to indefinite matrices may lead to failure, or to lack of convergence. The SYMMLQ method is suitable for both positive-definite and indefinite symmetric matrices. It is more robust than CG, but less efficient when A is positive-definite.
Both methods start from the residual r_0 = b - Ax_0, where x_0 is an initial estimate for the solution (often x_0 = 0), and generate an orthogonal basis for the Krylov subspace span{A^k r_0}, k = 0, 1, 2, ..., by means of three-term recurrence relations (see Golub and Van Loan (1996)). A sequence of symmetric tri-diagonal matrices T_k is also generated. Here and in the following, the index k denotes the iteration count. The resulting symmetric tri-diagonal systems of equations are usually more easily solved than the original problem. A sequence of solution iterates x_k is thus generated, such that the sequence of the norms of the residuals ||r_k|| converges to a required tolerance. Note that, in general, the convergence is not monotonic.
In exact arithmetic, after n iterations, this process is equivalent to an orthogonal reduction of A to symmetric tri-diagonal form, T_n = Q^T A Q; the solution x_n would thus achieve exact convergence. In finite-precision arithmetic, cancellation and round-off errors accumulate, causing loss of orthogonality. These methods must therefore be viewed as genuinely iterative methods, able to converge to a solution within a prescribed tolerance.
The orthogonal basis is not formed explicitly in either method. The basic difference between the two methods lies in the method of solution of the resulting symmetric tri-diagonal systems of equations: the CG method is equivalent to carrying out an LDL^T (Cholesky) factorization, whereas the Lanczos method (SYMMLQ) uses an LQ factorization.
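
For reference, the complete unpreconditioned CG recurrence is short enough to show in full. The following didactic C sketch (not the NAG implementation; the name cg and the absolute tolerance convention are assumptions of this example) makes the structure visible: one matrix-vector product, two inner products and three vector updates per iteration.

    /* Didactic sketch of unpreconditioned CG (cf. Hestenes and Stiefel
       (1952)).  matvec applies the symmetric positive-definite A; x
       holds the initial guess on entry and the solution on exit.
       Returns the iteration count. */
    #include <math.h>
    #include <stdlib.h>
    #include <string.h>

    static int cg(int n, void (*matvec)(int, const double *, double *),
                  const double *b, double *x, double tol, int maxit)
    {
        double *r = malloc(n * sizeof *r), *p = malloc(n * sizeof *p),
               *q = malloc(n * sizeof *q);
        matvec(n, x, q);
        for (int i = 0; i < n; i++) r[i] = b[i] - q[i];  /* r0 = b - A x0 */
        memcpy(p, r, n * sizeof *r);
        double rho = 0.0;
        for (int i = 0; i < n; i++) rho += r[i] * r[i];
        int it = 0;
        while (sqrt(rho) > tol && it < maxit) {          /* ||r|| > tol */
            matvec(n, p, q);                             /* q = A p */
            double pq = 0.0;
            for (int i = 0; i < n; i++) pq += p[i] * q[i];
            double alpha = rho / pq;
            for (int i = 0; i < n; i++) { x[i] += alpha * p[i];
                                          r[i] -= alpha * q[i]; }
            double rho_new = 0.0;
            for (int i = 0; i < n; i++) rho_new += r[i] * r[i];
            for (int i = 0; i < n; i++)
                p[i] = r[i] + (rho_new / rho) * p[i];    /* new direction */
            rho = rho_new;
            it++;
        }
        free(r); free(p); free(q);
        return it;
    }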
A preconditioner for these methods must be symmetric and positive-definite, i.e., representable by M = EE^T, where E is non-singular, and such that Ā = E^(-1)AE^(-T) ≈ I_n in (2), where I_n is the identity matrix of order n. These are formal definitions, used only in the design of the algorithms; in practice, only the means to compute the matrix-vector products v = Au and to solve the preconditioning equations Mv = u are required.
Preconditioning matrices M are typically based on incomplete factorizations (see Meijerink and Van der Vorst (1977)), or on the approximate inverses occurring in stationary iterative methods (see Young (1971)). A common example is the incomplete Cholesky factorization

    A = M + R,    M = PLDL^T P^T

where P is a permutation matrix, L is lower triangular with unit diagonal elements, D is diagonal and R is a remainder matrix. A zero-fill incomplete Cholesky factorization is one for which the matrix M has the same pattern of non-zero entries as A. This is obtained by discarding any fill elements (non-zero elements of M arising during the factorization in locations where A has zero elements). Allowing some of these fill elements to be kept rather than discarded generally increases the accuracy of the factorization at the expense of some loss of sparsity. For further details see Barrett et al. (1994).
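
Once the incomplete factors are available, applying the preconditioner means solving Mv = u by three cheap sweeps. The didactic C sketch below makes the L, D, L^T structure explicit; for clarity it assumes dense column-major storage and omits the permutation P, so it is purely an illustration of the algebra, not the NAG routine.

    /* Solve M v = u with M = L D L^T: forward sweep, diagonal scaling,
       backward sweep.  L is n-by-n, column-major, unit lower triangular
       (unit diagonal implied); d holds the diagonal of D. */
    void ldlt_solve(int n, const double *L, const double *d,
                    const double *u, double *v)
    {
        /* forward solve: L w = u (w overwrites v) */
        for (int i = 0; i < n; i++) {
            v[i] = u[i];
            for (int j = 0; j < i; j++)
                v[i] -= L[i + j * n] * v[j];
        }
        /* diagonal solve: D y = w */
        for (int i = 0; i < n; i++)
            v[i] /= d[i];
        /* back solve: L^T v = y */
        for (int i = n - 1; i >= 0; i--)
            for (int j = i + 1; j < n; j++)
                v[i] -= L[j + i * n] * v[j];
    }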

Recommendations on Choice and Use of Available Functions

Functions for Nonsymmetric Linear Systems

All functions for nonsymmetric linear systems in this chapter use the coordinate storage (CS) format described in Section [Coordinate storage (CS) format].
In general it is not possible to recommend one of these methods in preference to another. RGMRES is popular but requires the most storage, and can easily stagnate when the size of the orthogonal basis is too small or the preconditioner is not good enough. CGS can be the fastest method, but the computed residuals can exhibit instability which may greatly affect the convergence and quality of the solution. Bi-CGSTAB(ℓ) seems robust and reliable, but it can be slower than the other methods. Some further discussion of the relative merits of these methods can be found in Barrett et al. (1994).
Function f11dac (nag_sparse_nsym_fac) computes a preconditioning matrix based on incomplete LU factorization. The amount of fill-in occurring in the incomplete factorization can be controlled by specifying either the level of fill or the drop tolerance. Partial or complete pivoting may optionally be employed, and the factorization can be modified to preserve row-sums.
Function f11dcc (nag_sparse_nsym_fac_sol) uses the incomplete LU preconditioning matrix generated by f11dac (nag_sparse_nsym_fac) to solve a sparse nonsymmetric linear system, using RGMRES, CGS, Bi-CGSTAB(ℓ) or TFQMR.
Function f11dec (nag_sparse_nsym_sol) is similar to f11dcc (nag_sparse_nsym_fac_sol) but has options for no preconditioning, SSOR preconditioning or Jacobi preconditioning.
The utility function f11zac (nag_sparse_nsym_sort) orders the non-zero elements of a sparse nonsymmetric matrix stored in general CS format.
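
The ordering performed by such a sort utility can be pictured as sorting the (irow, icol, a) triples lexicographically by row and then by column, as required by the Notes in Section [Coordinate storage (CS) format]. The following C sketch (based on the standard qsort, not the NAG routine itself) illustrates the idea.

    /* Illustration of CS ordering: sort the (irow, icol, a) triples by
       row index, then by column index within each row. */
    #include <stdlib.h>

    struct triple { int irow, icol; double a; };

    static int cmp_triple(const void *p, const void *q)
    {
        const struct triple *s = p, *t = q;
        if (s->irow != t->irow) return s->irow - t->irow;
        return s->icol - t->icol;
    }

    /* Gather the three parallel arrays into triples, sort, scatter back. */
    void cs_sort(int nnz, double a[], int irow[], int icol[])
    {
        struct triple *w = malloc(nnz * sizeof *w);
        for (int k = 0; k < nnz; k++)
            w[k] = (struct triple){ irow[k], icol[k], a[k] };
        qsort(w, nnz, sizeof *w, cmp_triple);
        for (int k = 0; k < nnz; k++) {
            irow[k] = w[k].irow; icol[k] = w[k].icol; a[k] = w[k].a;
        }
        free(w);
    }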

Decision Tree

Tree 1

Q1: Is the matrix symmetric positive-definite?
    yes: go to Q2
    no:  go to Q3

Q2: Do you wish to use an incomplete Cholesky preconditioner?
    yes: f11jac and f11jcc
    no:  f11jec

Q3: Do you wish to use an incomplete LU preconditioner?
    yes: f11dac and f11dcc
    no:  f11dec

Index

Black Box functions for real nonsymmetric linear systems,
    RGMRES, CGS, Bi-CGSTAB(ℓ) or TFQMR solver with incomplete LU preconditioning: f11dcc (nag_sparse_nsym_fac_sol)
    RGMRES, CGS, Bi-CGSTAB(ℓ) or TFQMR solver with no preconditioning, Jacobi, or SSOR preconditioning: f11dec (nag_sparse_nsym_sol)
Black Box functions for real symmetric linear systems,
    CG or SYMMLQ solver with incomplete Cholesky preconditioning: f11jcc (nag_sparse_sym_chol_sol)
    CG or SYMMLQ solver with no preconditioning, Jacobi, or SSOR preconditioning: f11jec (nag_sparse_sym_sol)
Utility functions for real nonsymmetric linear systems,
    Incomplete LU factorization: f11dac (nag_sparse_nsym_fac)
    Sort function for real nonsymmetric matrices in CS format: f11zac (nag_sparse_nsym_sort)
Utility functions for real symmetric linear systems,
    Incomplete Cholesky factorization: f11jac (nag_sparse_sym_chol_fac)
    Sort function for real symmetric matrices in SCS format: f11zbc (nag_sparse_sym_sort)

See Also

Duff I S, Erisman A M and Reid J K (1986) Direct Methods for Sparse Matrices Oxford University Press, London
Barrett R, Berry M, Chan T F, Demmel J, Donato J, Dongarra J, Eijkhout V, Pozo R, Romine C and Van der Vorst H (1994) Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods SIAM, Philadelphia
Demmel J W, Eisenstat S C, Gilbert J R, Li X S and Liu J W H (1999) A Supernodal Approach to Sparse Partial Pivoting SIAM J. Matrix Anal. Appl. 20 720–755 http://citeseer.nj.nec.com/demmel95supernodal.html
Freund R W and Nachtigal N (1991) QMR: a Quasi-Minimal Residual Method for Non-Hermitian Linear Systems Numer. Math. 60 315–339
Freund R W (1993) A transpose-free quasi-minimal residual algorithm for non-Hermitian linear systems SIAM J. Sci. Comput. 14 470–482
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hestenes M and Stiefel E (1952) Methods of conjugate gradients for solving linear systems J. Res. Nat. Bur. Stand. 49 409–436
Meijerink J and Van der Vorst H (1977) An iterative solution method for linear systems of which the coefficient matrix is a symmetric M-matrix Math. Comput. 31 148–162
Meijerink J and Van der Vorst H (1981) Guidelines for the usage of incomplete decompositions in solving sets of linear equations as they occur in practical problems J. Comput. Phys. 44 134–155
Paige C C and Saunders M A (1975) Solution of sparse indefinite systems of linear equations SIAM J. Numer. Anal. 12 617–629
Saad Y and Schultz M (1986) GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems SIAM J. Sci. Statist. Comput. 7 856–869
Sleijpen G L G and Fokkema D R (1993) BiCGSTAB for linear equations involving matrices with complex spectrum ETNA 1 11–32
Sonneveld P (1989) CGS, a fast Lanczos-type solver for nonsymmetric linear systems SIAM J. Sci. Statist. Comput. 10 36–52
Van der Vorst H (1992) Bi-CGSTAB, a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems SIAM J. Sci. Statist. Comput. 13 631–644
Young D (1971) Iterative Solution of Large Linear Systems Academic Press, New York