|
NAG[e02bec] NAG[nag_1d_spline_fit] - Least-squares cubic spline curve fit, automatic knot placement, one variable
|
|
Calling Sequence
e02bec(start, x, y, weights, s, nest, fp, spline_data, 'm'=m, 'warmstartinf'=warmstartinf, 'fail'=fail)
nag_1d_spline_fit(. . .)
Parameters
|
start - String;
|
|
|
On entry: start must be set to "Nag_Cold" or "Nag_Warm".
|
|
If start = "Nag_Cold" (cold start), the function will build up the knot set starting with no interior knots. No value need be assigned to the entry n of spline_data, and memory will be allocated internally for lamda, c, nag_w and nag_iw.
|
|
If start = "Nag_Warm" (warm start), the function will restart the knot-placing strategy using the knots found in a previous call of the function. In this case, all arguments except s must be unchanged from that previous call. A warm start can save much time when searching for a satisfactory value of the smoothing factor s.
|
|
Constraint: start = "Nag_Cold" or "Nag_Warm".
|
|
|
x - Vector(1..m, datatype=float[8]);
|
|
|
Constraint: x[1] < x[2] < ... < x[m].
|
|
|
y - Vector(1..m, datatype=float[8]);
|
|
|
|
weights - Vector(1..m, datatype=float[8]);
|
|
|
Constraint: weights[r] > 0.0, for r = 1, 2, ..., m.
|
|
|
s - float;
|
|
|
On entry: the smoothing factor, S.
|
|
If s = 0.0, the function returns an interpolating spline.
|
|
If s is smaller than machine precision, it is assumed equal to zero.
|
|
Constraint: s >= 0.0.
|
|
|
nest - integer;
|
|
|
On entry: an over-estimate for the number, n, of knots required.
|
|
Constraint: nest >= 8; if s = 0.0, nest >= m + 4.
|
|
|
fp - assignable;
|
|
|
On exit: the weighted sum of squared residuals of the computed spline approximation.
Note: On exit the variable fp will have a value of type float.
|
|
|
spline_data - table;
|
|
|
A Maple table, which should be generated using NAG[Nag_Spline], corresponding to the Nag_Spline structure.
|
|
On entry: if the warm start option is used, the value of n must be left unchanged from the previous call.
|
|
On exit: the total number, n, of knots of the computed spline.
|
|
|
'm'=m - integer; (optional)
|
|
|
Default value: the first dimension of the arrays x, y, weights.
|
|
On entry: m, the number of data points.
|
|
Constraint: m >= 4.
|
|
|
'warmstartinf'=warmstartinf - table; (optional)
|
|
|
A Maple table, which should be generated using NAG[Nag_Comm], corresponding to the Nag_Comm structure.
|
|
|
'fail'=fail - table; (optional)
|
|
|
The NAG error argument, see the documentation for NagError.
|
|
|
|
Description
|
|
|
Purpose
|
|
nag_1d_spline_fit (e02bec) computes a cubic spline approximation to an arbitrary set of data points. The knots of the spline are located automatically, but a single argument must be specified to control the trade-off between closeness of fit and smoothness of fit.
|
|
Description
|
|
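nag_1d_spline_fit (e02bec) determines a smooth cubic spline approximation $s(x)$ to the set of data points $(x_r, y_r)$, with weights $w_r$, for $r = 1, 2, \ldots, m$.
The spline is given in the B-spline representation
$$s(x) = \sum_{i=1}^{n-4} c_i N_i(x) \qquad (1)$$
where $N_i(x)$ denotes the normalized cubic B-spline defined upon the knots $\lambda_i, \lambda_{i+1}, \ldots, \lambda_{i+4}$.
The total number $n$ of these knots and their values $\lambda_1, \ldots, \lambda_n$ are chosen automatically by the function. The knots $\lambda_5, \ldots, \lambda_{n-4}$ are the interior knots; they divide the approximation interval $[x_1, x_m]$ into $n - 7$ sub-intervals. The coefficients $c_1, c_2, \ldots, c_{n-4}$ are then determined as the solution of the following constrained minimization problem:
minimize
$$\eta = \sum_{i=5}^{n-4} \delta_i^2 \qquad (2)$$
subject to the constraint
$$\theta = \sum_{r=1}^{m} \epsilon_r^2 \le S \qquad (3)$$
where $\delta_i$ stands for the discontinuity jump in the third order derivative of $s(x)$ at the interior knot $\lambda_i$, $\epsilon_r$ denotes the weighted residual $w_r (y_r - s(x_r))$, and $S$ is a non-negative number to be specified by the user.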
The quantity $\eta$ can be seen as a measure of the (lack of) smoothness of $s(x)$, while closeness of fit is measured through $\theta$. By means of the argument $S$, "the smoothing factor", the user controls the balance between these two (usually conflicting) properties. If $S$ is too large, the spline will be too smooth and signal will be lost (underfit); if $S$ is too small, the spline will pick up too much noise (overfit). In the extreme cases the function will return an interpolating spline if $S$ is set to zero, and the weighted least-squares cubic polynomial if $S$ is set very large. Experimenting with values of $S$ between these two extremes should result in a good compromise. (See Section [Choice of S] for advice on the choice of $S$.)
The method employed is outlined in Section [Outline of Method Used] and fully described in Dierckx (1975), Dierckx (1981a) and Dierckx (1982). It involves an adaptive strategy for locating the knots of the cubic spline (depending on the function underlying the data and on the value of $S$), and an iterative method for solving the constrained minimization problem once the knots have been determined.
Values of the computed spline, or of its derivatives or definite integral, can subsequently be computed by calling e02bbc (nag_1d_spline_evaluate), e02bcc (nag_1d_spline_deriv) or e02bdc (nag_1d_spline_intg), as described in Section [Evaluation of Computed Spline].
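The B-spline representation and the closeness-of-fit measure described above can be illustrated with a small sketch in plain Python. It is independent of the NAG routines: the function names, the Cox-de Boor recursion used for the basis functions, and the tiny knot set are all assumptions made for illustration, with 0-based indices in place of the 1-based indices of the text.

```python
def bspline_basis(i, k, knots, x):
    """Normalized B-spline of degree k on knots[i..i+k+1] (Cox-de Boor recursion)."""
    if k == 0:
        left, right = knots[i], knots[i + 1]
        if left <= x < right:
            return 1.0
        # Close the last non-empty interval so the right end point is covered.
        if x == right == knots[-1] and left < right:
            return 1.0
        return 0.0
    value = 0.0
    width_left = knots[i + k] - knots[i]
    if width_left > 0.0:  # skip terms with coincident knots (0/0 := 0)
        value += (x - knots[i]) / width_left * bspline_basis(i, k - 1, knots, x)
    width_right = knots[i + k + 1] - knots[i + 1]
    if width_right > 0.0:
        value += (knots[i + k + 1] - x) / width_right * bspline_basis(i + 1, k - 1, knots, x)
    return value


def spline_value(knots, coeffs, x):
    """s(x) as the sum of coefficients times cubic B-splines; len(coeffs) == len(knots) - 4."""
    return sum(c * bspline_basis(i, 3, knots, x) for i, c in enumerate(coeffs))


def weighted_residual_sum(knots, coeffs, xs, ys, ws):
    """Sum over the data of (w_r * (y_r - s(x_r)))**2, the closeness-of-fit measure."""
    return sum((w * (y - spline_value(knots, coeffs, x))) ** 2
               for x, y, w in zip(xs, ys, ws))


# Hypothetical example: n = 9 knots (one interior knot at 1.0), hence 5 coefficients.
knots = [0.0] * 4 + [1.0] + [2.0] * 4
coeffs = [1.0] * 5  # with all coefficients equal to 1 the spline is identically 1
theta = weighted_residual_sum(knots, coeffs,
                              xs=[0.0, 0.7, 2.0], ys=[1.0, 1.0, 2.0], ws=[1.0, 2.0, 3.0])
```

Only the last data point deviates from the spline here, so theta reduces to the square of its weighted residual.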
|
|
Error Indicators and Warnings
|
|
|
If the function fails with an error exit of NE_SPLINE_COEFF_CONV or NE_NUM_KNOTS_1D_GT, a spline approximation is returned, but it fails to satisfy the fitting criterion (see (2) and (3)) – perhaps by only a small amount, however.
|
"NE_ALLOC_FAIL"
Dynamic memory allocation failed.
"NE_BAD_PARAM"
On entry, argument start had an illegal value.
"NE_ENUMTYPE_WARM"
start = "Nag_Warm" at the first call of this function. start must be set to "Nag_Cold" at the first call.
"NE_INT_ARG_LT"
On entry, m must not be less than 4: m = ⟨value⟩.
"NE_NOT_STRICTLY_INCREASING"
The sequence x is not strictly increasing: x[⟨value⟩] = ⟨value⟩, x[⟨value⟩] = ⟨value⟩.
"NE_NUM_KNOTS_1D_GT"
The number of knots needed is greater than nest, nest = ⟨value⟩. If nest is already large, say nest > m/2, this may indicate that possibly s is too small: s = ⟨value⟩.
"NE_REAL_ARG_LT"
On entry, s must not be less than 0.0: s = ⟨value⟩.
"NE_SF_D_K_CONS"
On entry, nest = ⟨value⟩, s = ⟨value⟩, m = ⟨value⟩. Constraint: nest >= m + 4 when s = 0.0.
"NE_SPLINE_COEFF_CONV"
The iterative process has failed to converge. Possibly s is too small: s = ⟨value⟩.
"NE_WEIGHTS_NOT_POSITIVE"
On entry, the weights are not strictly positive: weights[⟨value⟩] = ⟨value⟩.
|
|
Accuracy
|
|
On successful exit, the approximation returned is such that its weighted sum of squared residuals fp is equal to the smoothing factor $S$, up to a specified relative tolerance of 0.001, except that if $n = 8$, fp may be significantly less than $S$: in this case the computed spline is simply a weighted least-squares polynomial approximation of degree 3, i.e., a spline with no interior knots.
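The accuracy statement above can be turned into a simple check on the values returned by a fit. The following is a minimal Python sketch, not part of the NAG interface: the function name and argument names are hypothetical, and n stands for the total number of knots returned (8 knots meaning no interior knots).

```python
def meets_fitting_criterion(fp, s, n, rel_tol=1.0e-3):
    """Return True if fp agrees with the smoothing factor s as described above.

    fp      -- weighted sum of squared residuals returned by the fit
    s       -- smoothing factor requested
    n       -- total number of knots of the returned spline
    rel_tol -- relative tolerance (0.001 according to the text)
    """
    if n == 8 and fp <= s:
        # No interior knots: the least-squares cubic polynomial is returned
        # and fp may be significantly less than s.
        return True
    # Note: for s = 0 this requires fp to be exactly zero, which is stricter
    # than a practical check on an interpolating spline would be.
    return abs(fp - s) <= rel_tol * s
```

For instance, a returned fp of 0.9995 for s = 1 lies within the tolerance, whereas fp = 0.5 is only acceptable when the spline has no interior knots.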
|
|
Further Comments
|
|
|
Timing
|
|
The time taken for a call of nag_1d_spline_fit (e02bec) depends on the complexity of the shape of the data, the value of the smoothing factor $S$, and the number of data points. If nag_1d_spline_fit (e02bec) is to be called for different values of $S$, much time can be saved by setting start = "Nag_Warm" after the first call.
|
|
Choice of S
|
|
If the weights have been correctly chosen (see the e02 Chapter Introduction), the standard deviation of $w_r y_r$ would be the same for all $r$, equal to $\sigma$, say. In this case, choosing the smoothing factor $S$ in the range $\sigma^2 (m \pm \sqrt{2m})$, as suggested by Reinsch (1967), is likely to give a good start in the search for a satisfactory value. Otherwise, experimenting with different values of $S$ will be required from the start, taking account of the remarks in Section [Description].
In that case, in view of computation time and memory requirements, it is recommended to start with a very large value for $S$ and so determine the least-squares cubic polynomial; the value returned for fp, call it $fp_0$, gives an upper bound for $S$. Then progressively decrease the value of $S$ to obtain closer fits, say by a factor of 10 in the beginning, i.e., $S = fp_0/10$, $S = fp_0/100$, and so on, and more carefully as the approximation shows more details.
The number of knots of the spline returned, and their location, generally depend on the value of $S$ and on the behaviour of the function underlying the data. However, if nag_1d_spline_fit (e02bec) is called with start = "Nag_Warm", the knots returned may also depend on the smoothing factors of the previous calls. Therefore if, after a number of trials with different values of $S$ and start = "Nag_Warm", a fit can finally be accepted as satisfactory, it may be worthwhile to call nag_1d_spline_fit (e02bec) once more with the selected value for $S$ but now using start = "Nag_Cold". Often, nag_1d_spline_fit (e02bec) then returns an approximation with the same quality of fit but with fewer knots, which is therefore better if data reduction is also important.
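The two starting strategies for the smoothing factor can be written down as a short Python sketch. The helper names are hypothetical, sigma is the assumed common standard deviation of the weighted data, and fp0 is the value of fp returned by a fit with a very large smoothing factor; none of this calls the NAG routines.

```python
import math


def reinsch_range(m, sigma):
    """Range sigma**2 * (m - sqrt(2m)) to sigma**2 * (m + sqrt(2m)) for m data points."""
    spread = math.sqrt(2.0 * m)
    return sigma ** 2 * (m - spread), sigma ** 2 * (m + spread)


def decreasing_schedule(fp0, steps, factor=10.0):
    """Trial smoothing factors fp0/factor, fp0/factor**2, ... (closer fits at each step)."""
    return [fp0 / factor ** k for k in range(1, steps + 1)]


# Hypothetical example: 15 data points with unit standard deviation,
# and an upper bound fp0 = 100.0 obtained from a fit with very large S.
low, high = reinsch_range(15, 1.0)
trial_values = decreasing_schedule(100.0, 3)
```

Each trial value would be passed as s in a warm-start call, followed by one final cold-start call with the accepted value, as recommended above.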
|
|
Outline of Method Used
|
|
If $S = 0$, the requisite number of knots is known in advance, i.e., $n = m + 4$; the interior knots are located immediately as $\lambda_i = x_{i-2}$, for $i = 5, 6, \ldots, n - 4$. The corresponding least-squares spline (see e02bac (nag_1d_spline_fit_knots)) is then an interpolating spline and therefore a solution of the problem.
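The knot placement for the interpolating case can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, indices are 0-based, and the sketch assumes that the four knots at each end coincide with the first and last data abscissae, which is not stated in this section.

```python
def interpolation_knots(x):
    """Knot set for an interpolating cubic spline through strictly increasing x.

    Interior knots are the data abscissae except the second and the
    second-to-last ones; the four end knots on each side are assumed to
    coincide with x[0] and x[-1].
    """
    x = list(x)
    m = len(x)
    if m < 4:
        raise ValueError("at least 4 data points are required")
    interior = x[2:m - 2]  # m - 4 interior knots
    return [x[0]] * 4 + interior + [x[-1]] * 4  # m + 4 knots in total


# Abscissae of the example in this document (m = 15).
x_example = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0,
             4.5, 5.0, 5.5, 6.0, 7.0, 7.5, 8.0]
knots_example = interpolation_knots(x_example)
```

With 15 data points this yields 19 knots, which matches the value of nest used in the example of this document.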
If $S > 0$, a suitable knot set is built up in stages (starting with no interior knots in the case of a cold start but with the knot set found in a previous call if a warm start is chosen). At each stage, a spline is fitted to the data by least-squares (see e02bac (nag_1d_spline_fit_knots)) and $\theta$, the weighted sum of squares of residuals, is computed. If $\theta > S$, new knots are added to the knot set to reduce $\theta$ at the next stage. The new knots are located in intervals where the fit is particularly poor, their number depending on the value of $S$ and on the progress made so far in reducing $\theta$. Sooner or later, we find that $\theta \le S$ and at that point the knot set is accepted. The function then goes on to compute the (unique) spline which has this knot set and which satisfies the full fitting criterion specified by (2) and (3). The theoretical solution has $\theta = S$. The function computes the spline by an iterative scheme which is ended when $\theta = S$ within a relative tolerance of 0.001. The main part of each iteration consists of a linear least-squares computation of special form, done in a similarly stable and efficient manner as in e02bac (nag_1d_spline_fit_knots).
An exception occurs when the function finds at the start that, even with no interior knots ($n = 8$), the least-squares spline already has its weighted sum of squares of residuals $\le S$. In this case, since this spline (which is simply a cubic polynomial) also has an optimal value for the smoothness measure $\eta$, namely zero, it is returned at once as the (trivial) solution. It will usually mean that $S$ has been chosen too large.
For further details of the algorithm and its use, see Dierckx (1981a).
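The trivial case mentioned above, a spline with no interior knots, is a weighted least-squares cubic polynomial. The following Python sketch computes such a polynomial and its weighted sum of squared residuals for the data of the example in this document. It is an illustration under stated assumptions, not the method used by the function: it solves the normal equations by Gaussian elimination, which is adequate for this small problem but less stable than the approach the text attributes to e02bac, and all names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        acc = aug[r][n] - sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = acc / aug[r][r]
    return sol


def weighted_cubic_fit(x, y, w):
    """Minimize sum((w_r * (y_r - p(x_r)))**2) over cubic polynomials p.

    Returns the coefficients of 1, x, x**2, x**3 and the minimal sum.
    """
    rows = [[wr * xr ** p for p in range(4)] for xr, wr in zip(x, w)]
    rhs = [wr * yr for yr, wr in zip(y, w)]
    normal = [[sum(row[i] * row[j] for row in rows) for j in range(4)]
              for i in range(4)]
    normal_rhs = [sum(row[i] * t for row, t in zip(rows, rhs)) for i in range(4)]
    coeffs = solve_linear_system(normal, normal_rhs)
    fp0 = sum((wr * (yr - sum(c * xr ** p for p, c in enumerate(coeffs)))) ** 2
              for xr, yr, wr in zip(x, y, w))
    return coeffs, fp0


# Data of the example in this document.
X = [0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 4.5, 5, 5.5, 6, 7, 7.5, 8]
Y = [-1.1, -0.372, 0.431, 1.69, 2.11, 3.1, 4.23, 4.35,
     4.81, 4.61, 4.79, 5.23, 6.35, 7.19, 7.97]
W = [1, 2, 1.5, 1, 3, 1, 0.5, 1, 2, 2.5, 1, 3, 1, 2, 1]
cubic_coeffs, fp0 = weighted_cubic_fit(X, Y, W)
```

Any smoothing factor at or above this fp0 would lead to the trivial solution, so fp0 serves as the upper bound from which smaller trial values can be derived.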
|
|
Evaluation of Computed Spline
|
|
The value of the computed spline at a given value x may be obtained in the variable sval by the call:
e02bbc(x, sval, spline)
where spline is a table corresponding to the Nag_Spline structure, namely the output argument spline_data of nag_1d_spline_fit (e02bec).
The values of the spline and its first three derivatives at a given value x may be obtained in the array sdif of dimension at least 4 by the call:
e02bcc(derivs, x, sdif, spline)
where, if derivs = "Nag_LeftDerivs", left-hand derivatives are computed and, if derivs = "Nag_RightDerivs", right-hand derivatives are calculated. The value of derivs is only relevant if x is an interior knot.
The value of the definite integral of the spline over the interval x[1] to x[m] can be obtained in the variable sint by the call:
e02bdc(spline, sint)
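For a cubic spline in B-spline form, the definite integral over the whole approximation interval can also be written directly in terms of the knots and coefficients, because each normalized cubic B-spline integrates to one quarter of the width of its support. The Python sketch below illustrates this identity; it is not a description of how e02bdc works internally, the function name is hypothetical, indices are 0-based, and the four end knots on each side are assumed to coincide with the ends of the interval.

```python
def spline_integral_full_range(knots, coeffs):
    """Integral of sum(c_i * N_i(x)) over the whole interval spanned by the knots.

    Uses the identity: integral of N_i over its support = (knots[i+4] - knots[i]) / 4.
    Valid only when the integration interval covers the support of every
    basis function, as is the case with coincident end knots.
    """
    if len(coeffs) != len(knots) - 4:
        raise ValueError("expected len(knots) - 4 coefficients")
    return sum(c * (knots[i + 4] - knots[i]) / 4.0 for i, c in enumerate(coeffs))


# Hypothetical example on [0, 2] with one interior knot at 1.0.
knots = [0.0] * 4 + [1.0] + [2.0] * 4
constant_one = [1.0] * 5                      # spline identically equal to 1
identity = [0.0, 1.0 / 3.0, 1.0, 5.0 / 3.0, 2.0]  # coefficients of the spline s(x) = x
area_constant = spline_integral_full_range(knots, constant_one)
area_identity = spline_integral_full_range(knots, identity)
```

Both example splines integrate to 2 over the interval from 0 to 2, which gives a quick consistency check of a value returned in sint.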
|
|
|
|
Examples
|
|
>
|
# Each statement ends with ':' which suppresses its output; use ';' instead to display a result.
# Problem data: m = 15 weighted data points, first fitted with smoothing factor s = 1.
# nest = m + 4 = 19 is the number of knots needed for interpolation (s = 0).
start := "Nag_Cold":
m := 15:
s := 1:
nest := 19:
warmstartinf := NAG:-Nag_Comm():
spline_data := NAG:-Nag_Spline():
x := Vector([0, 0.5, 1, 1.5, 2, 2.5, 3, 4, 4.5, 5, 5.5, 6, 7, 7.5, 8], datatype=float[8]):
y := Vector([-1.1, -0.372, 0.431, 1.69, 2.11, 3.1, 4.23, 4.35, 4.81, 4.61, 4.79, 5.23, 6.35, 7.19, 7.97], datatype=float[8]):
weights := Vector([1, 2, 1.5, 1, 3, 1, 0.5, 1, 2, 2.5, 1, 3, 1, 2, 1], datatype=float[8]):
# First fit: cold start with s = 1.
NAG:-e02bec(start, x, y, weights, s, nest, fp, spline_data, 'm' = m,
'warmstartinf' = warmstartinf):
# Results: n - 6, the knot values, the B-spline coefficients c[1..n-4], and fp.
spline_data['n']-6:
seq(spline_data['lamda'][j],j=4..spline_data['n']-1):
for j from 1 to spline_data['n']-4 do
j, spline_data['c'][j]:
end do:
fp:
# Second fit: warm start with the smaller smoothing factor s = 0.5,
# reusing the knots found in the previous call.
start := "Nag_Warm":
s := 0.5:
NAG:-e02bec(start, x, y, weights, s, nest, fp, spline_data, 'm' = m,
'warmstartinf' = warmstartinf):
spline_data['n']-6:
seq(spline_data['lamda'][j],j=4..spline_data['n']-1):
for j from 1 to spline_data['n']-4 do
j, spline_data['c'][j]:
end do:
fp:
# Third fit: another warm start, now with s = 0.1.
s := 0.1:
NAG:-e02bec(start, x, y, weights, s, nest, fp, spline_data, 'm' = m,
'warmstartinf' = warmstartinf):
spline_data['n']-6:
seq(spline_data['lamda'][j],j=4..spline_data['n']-1):
for j from 1 to spline_data['n']-4 do
j, spline_data['c'][j]:
end do:
fp:
|
|
|
See Also
|
|
Dierckx P (1975) An algorithm for smoothing, differentiating and integration of experimental data using spline functions J. Comput. Appl. Math. 1 165–184
Dierckx P (1981a) An improved algorithm for curve fitting with spline functions Report TW54 Department of Computer Science, Katholieke Universiteit Leuven
Dierckx P (1982) A fast algorithm for smoothing data on a rectangular grid while using spline functions SIAM J. Numer. Anal. 19 1286–1304
Reinsch C H (1967) Smoothing by spline functions Numer. Math. 10 177–183
e02 Chapter Introduction.
NAG Toolbox Overview.
NAG Web Site.
|
|