NAG[g03cac] NAG[nag_mv_factor] - Maximum likelihood estimates of parameters

Calling Sequence
g03cac(matrix_type, x, isx, e, stat, com, psi, res, fl, eps, 'n'=n, 'm'=m, 'tdx'=tdx, 'nvar'=nvar, 'nfac'=nfac, 'wt'=wt, 'tdfl'=tdfl, 'optional_settings'=optional_settings, 'fail'=fail)
nag_mv_factor(. . .)
Parameters

matrix_type - String;

On entry: selects the type of matrix on which factor analysis is to be performed.

matrix_type = "Nag_DataCorr" (data input): the data matrix is input in x and the factor analysis is computed for the correlation matrix.

matrix_type = "Nag_DataCovar": the data matrix is input in x and the factor analysis is computed for the covariance matrix, i.e., the results are scaled as described in Section [Further Comments].

matrix_type = "Nag_MatCorr_Covar": the correlation/variance-covariance matrix is input in x and the factor analysis is computed for this matrix.

Constraint: matrix_type = "Nag_DataCorr", "Nag_DataCovar" or "Nag_MatCorr_Covar".

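x - Matrix(1.., 1..tdx, datatype=float[8], order=C_order);

On entry: the input matrix.

If matrix_type = "Nag_DataCorr" or "Nag_DataCovar", x must contain the data matrix, with one row per observation and one column per variable.

If matrix_type = "Nag_MatCorr_Covar", x must contain the correlation or variance-covariance matrix. Only the upper triangular part is required.
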
isx - Vector(1..m, datatype=integer[kernelopts('wordsize')/8]);

On entry: isx[j] indicates whether or not the jth variable is to be included in the factor analysis, for j = 1, 2, ..., m.

Constraint: isx[j] > 0 for nvar values of j.

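As a small illustration of how isx and nvar relate, the sketch below builds an inclusion vector for five variables and counts the included ones. It assumes that a positive entry marks a variable as included; isx_demo and nvar_demo are illustrative names and not part of the interface.

```maple
# Illustrative only: include variables 1, 3 and 4 out of m = 5
# (assumption: a positive entry means "include this variable").
isx_demo := Vector([1, 0, 1, 1, 0], datatype = integer[kernelopts('wordsize')/8]):

# nvar must equal the number of included variables.
nvar_demo := add(`if`(isx_demo[j] > 0, 1, 0), j = 1 .. 5);   # 3
```

Computing nvar from isx in this way avoids the mismatch reported by the "NE_VAR_INCL_INDICATED" error.
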
e - Vector(1..nvar, datatype=float[8]);

On exit: the eigenvalues $\theta_i$, for $i = 1, 2, \ldots, p$.

stat - Vector(1.., datatype=float[8]);

On exit: the test statistics.

stat[1] contains the value $F(\hat{\Psi})$.

stat[2] contains the test statistic, $\chi^2$.

stat[3] contains the degrees of freedom associated with the test statistic.

stat[4] contains the significance level.

com - Vector(1..nvar, datatype=float[8]);

On exit: the communalities.

psi - Vector(1..nvar, datatype=float[8]);

On exit: the estimates of $\psi_i$, for $i = 1, 2, \ldots, p$.

res - Vector(1.., datatype=float[8]);

Note: the dimension, dim, of the array res must be at least nvar*(nvar-1)/2.

On exit: the residual correlations.

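The required length of res, and one possible way to locate an entry in it, can be sketched as follows. The packing order is an assumption: it follows the NAG C Library convention (strict upper triangle stored column by column) shifted to Maple's 1-based indexing. ResIndex and nvar_demo are hypothetical names used only for this illustration.

```maple
# Hypothetical helper: position in res of the residual correlation between
# variables i and j (i < j), assuming column-by-column packing of the strict
# upper triangle with 1-based indices.
ResIndex := proc(i::posint, j::posint)
    if i >= j then
        error "expected i < j";
    end if;
    return (j - 1)*(j - 2)/2 + i;
end proc:

nvar_demo := 9:
nvar_demo*(nvar_demo - 1)/2;   # minimum length of res: 36
ResIndex(1, 2);                # 1
ResIndex(8, 9);                # 36, the last element
```
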
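fl - Matrix(1..nvar, 1..tdfl, datatype=float[8], order=C_order);

On exit: the factor loadings. fl[i, j] contains $\lambda_{ij}$, for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, k$.
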
eps - float;

On entry: a lower bound for the value of $\psi_i$.

Constraint: machine precision <= eps < 1.0.

'n'=n - integer; (optional)

Default value: the first dimension of the array wt.

On entry: if matrix_type = "Nag_DataCorr" or "Nag_DataCovar", the number of observations in the data array x.

If matrix_type = "Nag_MatCorr_Covar", the (effective) number of observations used in computing the (possibly weighted) correlation/variance-covariance matrix input in x.

Constraint: n > nvar.

'm'=m - integer; (optional)

Default value: the first dimension of the array isx and the second dimension of the array x.

On entry: the number of variables in the data/correlation/variance-covariance matrix.

Constraint: m >= nvar.

'tdx'=tdx - integer; (optional)

On entry: the second dimension of the array x as declared in the function from which nag_mv_factor (g03cac) is called.

Constraint: tdx >= m.

'nvar'=nvar - integer; (optional)

Default value: the first dimension of the arrays e, com, psi, fl.

On entry: the number of variables in the factor analysis, $p$.

Constraint: nvar >= 2.

'nfac'=nfac - integer; (optional)

Default value: the second dimension of the array fl.

On entry: the number of factors, $k$.

Constraint: 1 <= nfac <= nvar.

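'wt'=wt - Vector(1..n, datatype=float[8]); (optional)

On entry: if matrix_type = "Nag_DataCorr" or "Nag_DataCovar", the weights to be used in the factor analysis. The effective number of observations is the sum of the weights.

If matrix_type = "Nag_MatCorr_Covar" or wt is set to the null pointer NULL, i.e., (double *)0, then wt is not referenced and the effective number of observations is n.
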
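The effective number of observations for weighted data can be checked before the call with a sketch like the one below. The weights are made up for illustration, and wt_demo and n_eff are hypothetical names.

```maple
# Made-up weights for four observations.
wt_demo := Vector([1.0, 2.0, 0.0, 1.5], datatype = float[8]):

# All weights must be non-negative when wt is referenced.
andmap(w -> evalb(w >= 0), convert(wt_demo, list));   # true

# Effective number of observations: the sum of the weights.
n_eff := add(wt_demo[i], i = 1 .. 4);                 # 4.5
```
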
'tdfl'=tdfl - integer; (optional)

On entry: the second dimension of the array fl as declared in the function from which nag_mv_factor (g03cac) is called.

Constraint: tdfl >= nfac.

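'optional_settings'=optional_settings - Vector; (optional)

Optional settings for the optimization, created with NAG:-Nag_E04_Opt and modified with NAG:-SetOptions (for example max_iter and optim_tol, as in the Examples section).
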
'fail'=fail - table; (optional)

The NAG error argument, see the documentation for NagError.

Description

Purpose

nag_mv_factor (g03cac) computes the maximum likelihood estimates of the parameters of a factor analysis model. Either the data matrix or a correlation/covariance matrix may be input. Factor loadings, communalities and residual correlations are returned.

Description

Let $p$ variables, $x_1, x_2, \ldots, x_p$, with variance-covariance matrix $\Sigma$ be observed. The aim of factor analysis is to account for the covariances in these $p$ variables in terms of a smaller number, $k$, of hypothetical variables, or factors, $f_1, f_2, \ldots, f_k$. These are assumed to be independent and to have unit variance. The relationship between the observed variables and the factors is given by the model:

$$x_i = \sum_{j=1}^{k} \lambda_{ij} f_j + e_i, \quad i = 1, 2, \ldots, p,$$

where $\lambda_{ij}$, for $i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, k$, are the factor loadings and $e_i$, for $i = 1, 2, \ldots, p$, are independent random variables with variances $\psi_i$, for $i = 1, 2, \ldots, p$. The $\psi_i$ represent the unique component of the variation of each observed variable. The proportion of variation for each variable accounted for by the factors is known as the communality. For this function it is assumed that both the $k$ factors and the $e_i$'s follow independent Normal distributions.

The model for the variance-covariance matrix, $\Sigma$, can be written as:

$$\Sigma = \Lambda \Lambda^T + \Psi \qquad (1)$$

where $\Lambda$ is the matrix of the factor loadings, $\lambda_{ij}$, and $\Psi$ is a diagonal matrix of unique variances, $\psi_i$, for $i = 1, 2, \ldots, p$.
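
To make the covariance structure concrete, the following sketch builds the matrix $\Lambda \Lambda^T + \Psi$ for a hypothetical one-factor model with three standardized variables. The communalities are taken as row sums of squared loadings, which equals the proportion of variance explained when each variable has unit variance. The loadings are made up, and LambdaMat, PsiMat, SigmaMat and com_demo are illustrative names (Psi itself is a protected name in Maple, hence the suffix).

```maple
with(LinearAlgebra):

# Hypothetical loadings for p = 3 variables and k = 1 factor.
LambdaMat := Matrix([[0.8], [0.7], [0.6]]):

# Communalities: row sums of squared loadings.
com_demo := Vector(3, i -> add(LambdaMat[i, j]^2, j = 1 .. 1));

# Unique variances chosen so that each variable has unit variance.
PsiMat := DiagonalMatrix([seq(1 - com_demo[i], i = 1 .. 3)]):

# Model-implied covariance (here correlation) matrix.
SigmaMat := LambdaMat . Transpose(LambdaMat) + PsiMat;
```

The off-diagonal entries of SigmaMat are the products of the loadings (for example 0.8 * 0.7 = 0.56), and its diagonal is 1.
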
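The estimation of the parameters of the model, $\Lambda$ and $\Psi$, by maximum likelihood is described by Lawley and Maxwell (1971). The log likelihood is:

$$-\tfrac{1}{2}(n-1)\log(|\Sigma|) - \tfrac{1}{2}(n-1)\,\mathrm{trace}(S\Sigma^{-1}) + \text{constant},$$

where $n$ is the number of observations, $S$ is the sample variance-covariance matrix or, if weights are used, $S$ is the weighted sample variance-covariance matrix and $n$ is the effective number of observations, that is, the sum of the weights. The constant is independent of the parameters of the model. A two stage maximization is employed. It makes use of the function $F(\Psi)$, which is, up to a constant, $-2/(n-1)$ times the log likelihood maximized over $\Lambda$. This is then minimized with respect to $\Psi$ to give the estimates, $\hat{\Psi}$, of $\Psi$. The function $F(\Psi)$ can be written as:

$$F(\Psi) = \sum_{j=k+1}^{p} (\theta_j - \log \theta_j) - (p - k),$$

where values $\theta_j$, for $j = 1, 2, \ldots, p$ are the eigenvalues of the matrix:

$$S^* = \Psi^{-1/2} S \Psi^{-1/2}.$$

The estimates $\hat{\Lambda}$, of $\Lambda$, are then given by scaling the eigenvectors of $S^*$, which are denoted by $V$:

$$\hat{\Lambda} = \Psi^{1/2} V (\Theta - I)^{1/2},$$

where $\Theta$ is the diagonal matrix with elements $\theta_j$, and $I$ is the identity matrix.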
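
The criterion $F(\Psi) = \sum_{j=k+1}^{p} (\theta_j - \log \theta_j) - (p - k)$ can be evaluated directly from its eigenvalue definition, as in the sketch below. This is only an illustration of the formula, not the routine's implementation (which works with a singular value decomposition). FPsi, PsiInvSqrt and S_demo are hypothetical names, and the test matrix is generated from a made-up one-factor model so that the criterion should vanish at the true unique variances.

```maple
with(LinearAlgebra):

# F(Psi) = sum_{j=k+1..p} (theta_j - ln(theta_j)) - (p - k), where theta_j are
# the eigenvalues of S* = Psi^(-1/2) . S . Psi^(-1/2) in decreasing order.
FPsi := proc(S::Matrix, psi::list, k::nonnegint)
    local p, PsiInvSqrt, Sstar, theta, j;
    p := RowDimension(S);
    PsiInvSqrt := DiagonalMatrix([seq(1/sqrt(psi[j]), j = 1 .. p)]);
    Sstar := PsiInvSqrt . S . PsiInvSqrt;
    # Eigenvalues of a float matrix may come back as complex floats with zero
    # imaginary part, so keep the real parts before sorting.
    theta := sort(map(Re, convert(Eigenvalues(evalf(Sstar)), list)), `>`);
    return add(theta[j] - ln(theta[j]), j = k + 1 .. p) - (p - k);
end proc:

# Hypothetical check: S generated exactly by loadings (0.8, 0.7, 0.6)
# with unique variances (0.36, 0.51, 0.64).
S_demo := Matrix([[1.0, 0.56, 0.48], [0.56, 1.0, 0.42], [0.48, 0.42, 1.0]]):
FPsi(S_demo, [0.36, 0.51, 0.64], 1);   # close to 0, up to rounding
```

At the true unique variances the $p - k$ smallest eigenvalues of $S^*$ all equal 1, so each term $\theta_j - \log \theta_j$ equals 1 and the criterion is zero; any other choice of psi gives a positive value.
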
The minimization of $F(\Psi)$ is performed using e04lbc (nag_opt_bounds_2nd_deriv) which uses a modified Newton algorithm. The computation of the Hessian matrix is described by Clark (1970). However, instead of using the eigenvalue decomposition of the matrix $S^*$ as described above, the singular value decomposition of the matrix $R\Psi^{-1/2}$ is used, where $R$ is obtained either from the $QR$ decomposition of the (scaled) mean-centred data matrix or from the Cholesky decomposition of the correlation/covariance matrix. The function e04lbc (nag_opt_bounds_2nd_deriv) ensures that the values of $\psi_i$ are greater than a given small positive quantity, $\delta$ (supplied in eps), so that the communality is always less than one. This avoids the so-called Heywood cases.
In addition to the values of $\Lambda$, $\Psi$ and the communalities, nag_mv_factor (g03cac) returns the residual correlations, i.e., the off-diagonal elements of $C - (\Lambda \Lambda^T + \Psi)$ where $C$ is the sample correlation matrix. nag_mv_factor (g03cac) also returns the test statistic:

$$\chi^2 = \left[n - 1 - (2p + 5)/6 - 2k/3\right] F(\hat{\Psi}),$$

which can be used to test the goodness of fit of the model (1), see Lawley and Maxwell (1971) and Morrison (1967).
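
A rough sketch of how the goodness-of-fit quantities relate is given below, taking the statistic as $\chi^2 = [n - 1 - (2p + 5)/6 - 2k/3]\,F(\hat{\Psi})$. Two points are assumptions rather than statements from this page: the degrees of freedom use the standard formula $((p - k)^2 - (p + k))/2$ for the maximum likelihood factor model, and the significance level is taken to be the upper-tail chi-square probability. The value Fhat is a made-up placeholder, not a result of the routine.

```maple
with(Statistics):

# Sizes as in the Examples section; Fhat is a made-up placeholder value.
n_demo := 211: p_demo := 9: k_demo := 3:
Fhat := 0.05:

# chi^2 = [n - 1 - (2p + 5)/6 - 2k/3] * F(Psi-hat)
chi2 := (n_demo - 1 - (2*p_demo + 5)/6 - 2*k_demo/3)*Fhat;

# Degrees of freedom (assumed standard formula): ((p - k)^2 - (p + k))/2
df := ((p_demo - k_demo)^2 - (p_demo + k_demo))/2;   # 12

# Upper-tail probability of a chi-square variate with df degrees of freedom.
signif := 1 - CDF(RandomVariable(ChiSquare(df)), chi2);
```

A large upper-tail probability indicates no evidence against the $k$-factor model; a small one suggests that more factors may be needed.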

Error Indicators and Warnings

"NE_2_INT_ARG_GT"
On entry, nfac = ⟨value⟩ while nvar = ⟨value⟩. These arguments must satisfy nfac <= nvar.
"NE_2_INT_ARG_LE"
On entry, n = ⟨value⟩ while nvar = ⟨value⟩. These arguments must satisfy n > nvar.
"NE_2_INT_ARG_LT"
On entry, one of the following constraints is violated: m >= nvar, tdx >= m or tdfl >= nfac.
"NE_ALLOC_FAIL"
Dynamic memory allocation failed.
"NE_BAD_PARAM"
On entry, argument matrix_type had an illegal value.
"NE_INT_ARG_LT"
On entry, nfac must not be less than 1: nfac = ⟨value⟩.
"NE_INTERNAL_ERROR"
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please consult NAG for assistance.
"NE_INVALID_REAL_RANGE_EF"
Value given to eps is not valid. Correct range is machine precision <= eps < 1.0.
"NE_NEG_WEIGHT_ELEMENT"
On entry, wt[⟨value⟩] = ⟨value⟩. Constraint: when referenced, all elements of wt must be non-negative.
"NE_NOT_APPEND_FILE"
Cannot open file for appending.
"NE_NOT_CLOSE_FILE"
Cannot close file.
"NE_OBSERV_LT_VAR"
With weighted data, the effective number of observations given by the sum of weights = ⟨value⟩, while the number of variables included in the analysis, nvar = ⟨value⟩. Constraint: the effective number of observations must exceed nvar.
"NE_SVD_NOT_CONV"
A singular value decomposition has failed to converge. This is a very unlikely error exit.
"NE_VAR_INCL_INDICATED"
The number of variables in the analysis, nvar = ⟨value⟩, while the number of variables included in the analysis via array isx = ⟨value⟩. Constraint: these two numbers must be the same.
"NW_COND_MIN"
The conditions for a minimum have not all been satisfied but a lower point could not be found. Note that in this case all the results are computed. See e04lbc (nag_opt_bounds_2nd_deriv) for further details.
"NW_TOO_MANY_ITER"
The maximum number of iterations, ⟨value⟩, has been performed.
|
|
Further Comments

The factor loadings may be orthogonally rotated by using g03bac (nag_mv_orthomax) and factor score coefficients can be computed using g03ccc (nag_mv_fac_score).

The maximum likelihood estimators are invariant to a change in scale. This means that the results obtained will be the same (up to a scaling factor) if either the correlation matrix or the variance-covariance matrix is used. As the correlation matrix ensures that all values of $\psi_i$ are between 0 and 1, it will lead to a more efficient optimization. When the data matrix is input, the results are always computed for the correlation matrix and then scaled if the results for the covariance matrix are required. When the user inputs the covariance/correlation matrix, the input matrix itself is used, so the user is advised to input the correlation matrix rather than the covariance matrix.

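Since the correlation matrix is the recommended input, a covariance matrix can be rescaled before the call. The helper below is a minimal sketch of that rescaling; CovToCorr and S_cov are hypothetical names, and the 2 by 2 matrix is made up for illustration.

```maple
with(LinearAlgebra):

# Hypothetical helper: rescale a covariance matrix S to a correlation matrix
# by dividing entry (r, c) by the product of the two standard deviations.
CovToCorr := proc(S::Matrix)
    local p, sd, i;
    p := RowDimension(S);
    sd := [seq(sqrt(S[i, i]), i = 1 .. p)];
    return Matrix(p, p, (r, c) -> S[r, c]/(sd[r]*sd[c]),
                  datatype = float[8], order = C_order);
end proc:

S_cov := Matrix([[4.0, 2.0], [2.0, 9.0]]):
CovToCorr(S_cov);   # unit diagonal, off-diagonal entries 2/(2*3) = 0.333...
```

The result has the storage order and datatype expected for x, so it could be passed with matrix_type = "Nag_MatCorr_Covar".
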
Examples
|
|

# The input is a 9 by 9 correlation matrix based on 211 observations.
matrix_type := "Nag_MatCorr_Covar":
n := 211:
m := 9:
tdx := 9:
nvar := 9:
nfac := 3:
tdfl := 3:
# Optimization settings passed on to the minimizer.
optional_settings := NAG:-Nag_E04_Opt():
NAG:-SetOptions(optional_settings, max_iter=500, optim_tol=1e-2):
eps := 1e-05:
x := Matrix([
    [1, 0.523, 0.395, 0.471, 0.346, 0.426, 0.576, 0.434, 0.639],
    [0.523, 1, 0.479, 0.506, 0.418, 0.462, 0.547, 0.283, 0.645],
    [0.395, 0.479, 1, 0.355, 0.27, 0.254, 0.452, 0.219, 0.504],
    [0.471, 0.506, 0.355, 1, 0.691, 0.791, 0.443, 0.285, 0.505],
    [0.346, 0.418, 0.27, 0.691, 1, 0.679, 0.383, 0.149, 0.409],
    [0.426, 0.462, 0.254, 0.791, 0.679, 1, 0.372, 0.314, 0.472],
    [0.576, 0.547, 0.452, 0.443, 0.383, 0.372, 1, 0.385, 0.68],
    [0.434, 0.283, 0.219, 0.285, 0.149, 0.314, 0.385, 1, 0.47],
    [0.639, 0.645, 0.504, 0.505, 0.409, 0.472, 0.68, 0.47, 1]
], datatype=float[8], order='C_order'):
# Include all nine variables; wt is empty because a matrix is input.
isx := Vector([1, 1, 1, 1, 1, 1, 1, 1, 1], datatype=integer[kernelopts('wordsize')/8]):
wt := Vector([], datatype=float[8]):
# Output arrays: res has length nvar*(nvar-1)/2 = 36.
e := Vector(9, datatype=float[8]):
stat := Vector(4, datatype=float[8]):
com := Vector(9, datatype=float[8]):
psi := Vector(9, datatype=float[8]):
res := Vector(36, datatype=float[8]):
fl := Matrix(9, 3, datatype=float[8], order='C_order'):
NAG:-g03cac(matrix_type, x, isx, e, stat, com, psi, res, fl, eps, 'n' = n, 'm' = m, 'tdx' = tdx, 'nvar' = nvar, 'nfac' = nfac, 'wt' = wt, 'tdfl' = tdfl, 'optional_settings' = optional_settings):

See Also

Clark M R B (1970) A rapidly convergent method for maximum likelihood factor analysis British J. Math. Statist. Psych.
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20 (3) 2–25
Lawley D N and Maxwell A E (1971) Factor Analysis as a Statistical Method (2nd Edition) Butterworths
Morrison D F (1967) Multivariate Statistical Methods McGraw–Hill
g03 Chapter Introduction.
NAG Toolbox Overview.
NAG Web Site.