|
NAG[g02hdc] NAG[nag_robust_m_regsn_user_fn] - Robust regression, compute regression with user-supplied functions and weights
|
|
Calling Sequence
g02hdc(chi, psi, psip0, beta, regtype, sigma_est, x, y, wgt, theta, k, sigma, rs, tol, eps, maxit, nitmon, nit, 'n'=n, 'm'=m, 'outfile'=outfile, 'comm'=comm, 'fail'=fail)
nag_robust_m_regsn_user_fn(. . .)
Parameters
|
chi - procedure;
|
|
|
t - float;
|
|
|
On entry: the argument for which chi must be evaluated.
|
|
|
comm - table;
|
|
|
A Maple table, which should be generated using NAG[Nag_Comm], corresponding to the Nag_Comm structure.
|
|
Before calling nag_robust_m_regsn_user_fn (g02hdc) this field may be initialized for use by chi when called from nag_robust_m_regsn_user_fn (g02hdc).
|
|
|
chi is required only if , otherwise it can be specified as a pointer with 0 value.
|
|
|
psi - procedure;
|
|
|
psi must return the value of the weight function for a given value of its argument.
|
|
t - float;
|
|
|
On entry: the argument for which psi must be evaluated.
|
|
|
comm - table;
|
|
|
A Maple table, which should be generated using NAG[Nag_Comm], corresponding to the Nag_Comm structure.
|
|
Before calling nag_robust_m_regsn_user_fn (g02hdc) this field may be initialized for use by psi when called from nag_robust_m_regsn_user_fn (g02hdc).
|
|
|
|
psip0 - float;
|
|
|
On entry: the value of .
|
|
|
regtype - String;
|
|
|
On entry: determines the type of regression to be performed.
|
|
Huber type regression.
|
|
Mallows type regression.
|
|
Schweppe type regression.
|
|
|
sigma_est - String;
|
|
|
On entry: determines how is to be estimated.
|
|
is estimated by median absolute deviation of residuals.
|
|
is held constant at its initial value.
|
|
|
x - Matrix(1..dim1, 1..m, datatype=float[8], order=order);
|
|
|
Note: this array may be supplied in Fortran_order or C_order , as specified by order. All array parameters must use a consistent order.
|
|
If , during calculations the elements of x will be transformed as described in Section [Description]. Before exit the inverse transformation will be applied. As a result there may be slight differences between the input x and the output x.
|
|
On exit: unchanged, except as described above.
|
|
|
y - Vector(1..n, datatype=float[8]);
|
|
|
On entry: the data values of the dependent variable.
|
|
If , during calculations the elements of y will be transformed as described in Section [Description]. Before exit the inverse transformation will be applied. As a result there may be slight differences between the input y and the output y.
|
|
On exit: unchanged, except as described above.
|
|
|
wgt - Vector(1..n, datatype=float[8]);
|
|
|
On entry: the weight for the th observation, for .
|
|
If , during calculations elements of wgt will be transformed as described in Section [Description]. Before exit the inverse transformation will be applied. As a result there may be slight differences between the input wgt and the output wgt.
|
|
If , the th observation is not included in the analysis.
|
|
If , wgt is not referenced.
|
|
On exit: unchanged, except as described above.
|
|
|
theta - Vector(1..m, datatype=float[8]);
|
|
|
On exit: the M-estimate of , for .
|
|
|
k - assignable;
|
|
|
Note: On exit the variable k will have a value of type integer.
|
|
On exit: the column rank of the matrix .
|
|
|
sigma - assignable;
|
|
|
Note: On exit the variable sigma will have a value of type float.
|
|
On entry: a starting value for the estimation of . sigma should be approximately the standard deviation of the residuals from the model evaluated at the value of given by theta on entry.
|
|
Constraint: . .
|
|
|
rs - Vector(1..n, datatype=float[8]);
|
|
|
On exit: the residuals from the model evaluated at final value of theta, i.e., rs contains the vector .
|
|
|
tol - float;
|
|
|
On entry: the relative precision for the final estimates. Convergence is assumed when both the relative change in the value of sigma and the relative change in the value of each element of theta are less than tol.
|
|
It is advisable for tol to be greater than .
|
|
Constraint: . .
|
|
|
eps - float;
|
|
|
On entry: a relative tolerance to be used to determine the rank of .
|
|
If or then machine precision will be used in place of tol.
|
|
A reasonable value for eps is where this value is possible.
|
|
|
maxit - integer;
|
|
|
On entry: the maximum number of iterations that should be used during the estimation.
|
|
A value of should be adequate for most uses.
|
|
Constraint: . .
|
|
|
nitmon - integer;
|
|
|
On entry: determines the amount of information that is printed on each iteration.
|
|
No information is printed.
|
|
On the first and every nitmon iterations the values of sigma, theta and the change in theta during the iteration are printed.
|
|
|
nit - assignable;
|
|
|
Note: On exit the variable nit will have a value of type integer.
|
|
On exit: the number of iterations that were used during the estimation.
|
|
|
'n'=n - integer; (optional)
|
|
|
On entry: , the number of observations.
|
|
Constraint: . .
|
|
|
'm'=m - integer; (optional)
|
|
|
On entry: , the number of independent variables.
|
|
Constraint: . .
|
|
|
'outfile'=outfile - character; (optional)
|
|
|
On entry: The name of a file to which intermediate or diagnostic output should be appended. If a value is not provided for this parameter then the behaviour of this routine is platform dependent. Usually all output will be suppressed, however on some platforms output will be produced and will be displayed in the Maple session.
|
|
|
'comm'=comm - table; (optional)
|
|
|
A Maple table, which should be generated using NAG[Nag_Comm], corresponding to the Nag_Comm structure.
|
|
|
'fail'=fail - table; (optional)
|
|
|
The NAG error argument, see the documentation for NagError.
|
|
|
|
Description
|
|
|
Purpose
|
|
nag_robust_m_regsn_user_fn (g02hdc) performs bounded influence regression (-estimates) using an iterative weighted least-squares algorithm.
|
|
Description
|
|
For the linear regression model
|
where is a vector of length of the dependent variable,
|
|
is a vector of length of unknown arguments,
|
nag_robust_m_regsn_user_fn (g02hdc) calculates the M-estimates given by the solution, , to the equation
(1)
|
is a suitable weight function,
|
|
and may be estimated at each iteration by the median absolute deviation of the residuals
|
or as the solution to
for a suitable weight function , where and are constants, chosen so that the estimator of is asymptotically unbiased if the errors, , have a Normal distribution. Alternatively may be held at a constant value.
The above describes the Schweppe type regression. If the are assumed to equal 1 for all , then Huber type regression is obtained. A third type, due to Mallows, replaces (1) by
This may be obtained by use of the transformations
(see Marazzi (1987b)).
The calculation of the estimates of can be formulated as an iterative weighted least-squares problem with a diagonal weight matrix given by
The value of at each iteration is given by the weighted least-squares regression of on . This is carried out by first transforming the and by
and then using a least squares solver. If is of full column rank then an orthogonal-triangular (QR) decomposition is used; if not, a singular value decomposition is used.
Observations with zero or negative weights are not included in the solution.
Note: there is no explicit provision in the function for a constant term in the regression model. However, the addition of a dummy variable whose value is for all observations will produce a value of corresponding to the usual constant term.
nag_robust_m_regsn_user_fn (g02hdc) is based on routines in ROBETH, see Marazzi (1987b).
|
|
Error Indicators and Warnings
|
|
"NE_ALLOC_FAIL"
Dynamic memory allocation failed.
"NE_BAD_PARAM"
On entry, argument had an illegal value.
"NE_CHI"
Value given by chi function : .
"NE_CONVERGENCE_SOL"
Iterations to solve weighted least squares equations failed to converge.
"NE_CONVERGENCE_THETA"
Iterations to calculate estimates of theta failed to converge in maxit iterations: .
"NE_FULL_RANK"
Weighted least squares equations not of full rank: rank .
"NE_INT"
On entry, . Constraint: .
On entry, . Constraint: .
On entry, . Constraint: .
"NE_INT_2"
On entry, , . Constraint: .
On entry, : , .
"NE_INTERNAL_ERROR"
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please consult NAG for assistance.
"NE_NOT_CLOSE_FILE"
Cannot close file .
"NE_NOT_WRITE_FILE"
Cannot open file for writing.
"NE_REAL"
On entry, . Constraint: .
On entry, . Constraint: .
On entry, . Constraint: .
"NE_ZERO_DF"
Value of : , .
"NE_ZERO_VALUE"
Estimated value of sigma is zero.
|
|
Accuracy
|
|
The accuracy of the results is controlled by tol.
|
|
Further Comments
|
|
In cases when it is important for the value of sigma to be of a reasonable magnitude. Too small a value may cause too many of the winsorized residuals, i.e., , to be zero, which will lead to convergence problems and may trigger the the error "NE_FULL_RANK" is raised error.
By suitable choice of the functions chi and psi this function may be used for other applications of iterative weighted least-squares.
For the variance-covariance matrix of see g02hfc (nag_robust_m_regsn_param_var).
|
|
|
Examples
|
|
>
|
chi := proc(t, comm)
local ps:
if (abs(t) < 1.5) then
ps := t:
else
ps := 1.5:
end if:
return (ps^2/2.0):
end proc:
psi := proc(t, comm)
if (t <= -1.5) then
-1.5:
elif (t >= 1.5) then
1.5:
else
t:
end if:
end proc:
psip0 := 1:
beta := 0.1443849979905463:
regtype := "Nag_SchweppeReg":
sigma_est := "Nag_SigmaChi":
n := 5:
m := 3:
sigma := 1:
tol := 5e-05:
eps := 5e-06:
maxit := 50:
nitmon := 0:
comm := NAG:-Nag_Comm():
x := Matrix([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1], [1, 0, 3]], datatype=float[8]):
y := Vector([10.5, 11.3, 12.6, 13.4, 17.1], datatype=float[8]):
wgt := Vector([0.4039, 0.5012, 0.4039, 0.5012, 0.3862], datatype=float[8]):
theta := Vector([0, 0, 0], datatype=float[8]):
rs := Vector(5, datatype=float[8]):
NAG:-g02hdc(chi, psi, psip0, beta, regtype, sigma_est, x, y, wgt, theta, k, sigma, rs, tol, eps, maxit, nitmon, nit, 'n' = n, 'm' = m, 'comm' = comm):
|
|
|
See Also
|
|
Hampel F R, Ronchetti E M, Rousseeuw P J and Stahel W A (1986) Robust Statistics. The Approach Based on Influence Functions Wiley
Huber P J (1981) Robust Statistics Wiley
Marazzi A (1987b) Subroutines for robust and bounded influence regression in ROBETH Cah. Rech. Doc. IUMSP, No. 3 ROB 2 Institut Universitaire de Médecine Sociale et Préventive, Lausanne
g02 Chapter Introduction.
NAG Toolbox Overview.
NAG Web Site.
|
|