|
NAG[g03adc] NAG[nag_mv_canon_corr] - Canonical correlation analysis
|
|
Calling Sequence
g03adc(z, isz, e, ncv, cvx, cvy, tol, 'n'=n, 'm'=m, 'tdz'=tdz, 'nx'=nx, 'ny'=ny, 'wt'=wt, 'tde'=tde, 'tdcvx'=tdcvx, 'tdcvy'=tdcvy, 'fail'=fail)
nag_mv_canon_corr(. . .)
Parameters
|
z - Matrix(1..n, 1..tdz, datatype=float[8], order=C_order);
|
|
|
|
isz - Vector(1..m, datatype=integer[kernelopts('wordsize')/8]);
|
|
|
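On entry: the $j$th element of isz indicates whether or not the $j$th variable is to be included in the analysis and to which set of variables it belongs.
|
|
If the $j$th element of isz is positive, the variable contained in the $j$th column of z is included as an $x$ variable; if it is negative, that variable is included as a $y$ variable; if it is zero, that variable is not included in the analysis.
|
|
Constraint: only nx elements of isz can be $> 0$ and only ny elements of isz can be $< 0$.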
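The selection rule for isz can be illustrated with a small Python sketch. This is not part of the NAG interface: the helper name `split_variables` is hypothetical, column indices are zero-based here, and the sign convention (positive for $x$, negative for $y$) is the one implied by the constraint above.

```python
def split_variables(isz, nx, ny):
    """Split column indices into x and y sets according to the signs in isz.

    Hypothetical helper, not a NAG function. Returned indices are
    zero-based positions in isz (and hence columns of z).
    """
    x_cols = [j for j, flag in enumerate(isz) if flag > 0]   # x variables
    y_cols = [j for j, flag in enumerate(isz) if flag < 0]   # y variables
    # Entries equal to zero are simply left out of both sets.
    if len(x_cols) != nx or len(y_cols) != ny:
        raise ValueError(
            "isz selects %d x and %d y variables, but nx = %d and ny = %d"
            % (len(x_cols), len(y_cols), nx, ny)
        )
    return x_cols, y_cols


# The isz vector from the example on this page.
x_cols, y_cols = split_variables([-1, 1, 1, -1], nx=2, ny=2)
```

With the example's isz, the second and third columns of z form the $x$ set and the first and fourth columns form the $y$ set.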
|
|
|
e - Matrix(1.., 1..tde, datatype=float[8], order=C_order);
|
|
|
Note: the dimension, dim, of the array e must be at least $\min(\mathrm{nx}, \mathrm{ny}) \times \mathrm{tde}$.
|
|
|
ncv - assignable;
|
|
|
Note: On exit the variable ncv will have a value of type integer.
|
|
|
cvx - Matrix(1..nx, 1..tdcvx, datatype=float[8], order=C_order);
|
|
|
|
cvy - Matrix(1..ny, 1..tdcvy, datatype=float[8], order=C_order);
|
|
|
|
tol - float;
|
|
|
On entry: the value of tol is used to decide if the variables are of full rank and, if not, what is the rank of the variables. The smaller the value of tol the stricter the criterion for selecting the singular value decomposition. If a non-negative value of tol less than machine precision is entered, then the square root of machine precision is used instead.
|
|
Constraint: $\mathrm{tol} \ge 0.0$.
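The substitution rule for small values of tol can be sketched in Python as follows. The function name `effective_tolerance` is hypothetical, and using `sys.float_info.epsilon` as "machine precision" is an assumption; the NAG library's own machine-precision constant may differ by a small factor.

```python
import math
import sys


def effective_tolerance(tol, eps=sys.float_info.epsilon):
    """Return the tolerance actually used, following the rule stated above.

    Hypothetical helper: eps stands in for machine precision (assumption).
    """
    if tol < 0.0:
        raise ValueError("tol must not be less than 0.0")
    # A non-negative tol below machine precision is replaced by sqrt(eps).
    return math.sqrt(eps) if tol < eps else tol
```

For instance, `effective_tolerance(0.0)` falls back to the square root of `eps`, while a value such as `1e-6` is used unchanged.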
|
|
|
'n'=n - integer; (optional)
|
|
|
Default value: the first dimension of the arrays z, wt.
|
|
On entry: the number of observations, $n$.
|
|
Constraint: $\mathrm{n} > \mathrm{nx} + \mathrm{ny}$.
|
|
|
'm'=m - integer; (optional)
|
|
|
Default value: the first dimension of the array isz and the second dimension of the array z.
|
|
On entry: the total number of variables, $m$.
|
|
Constraint: $\mathrm{m} \ge 2$.
|
|
|
'tdz'=tdz - integer; (optional)
|
|
|
On entry: the second dimension of the array z as declared in the function from which nag_mv_canon_corr (g03adc) is called.
|
|
Constraint: $\mathrm{tdz} \ge \mathrm{m}$.
|
|
|
'nx'=nx - integer; (optional)
|
|
|
Default value: the first dimension of the array cvx.
|
|
On entry: the number of $x$ variables in the analysis, $n_x$.
|
|
Constraint: $\mathrm{nx} \ge 1$.
|
|
|
'ny'=ny - integer; (optional)
|
|
|
Default value: the first dimension of the array cvy.
|
|
On entry: the number of $y$ variables in the analysis, $n_y$.
|
|
Constraint: $\mathrm{ny} \ge 1$.
|
|
|
'wt'=wt - Vector(1..n, datatype=float[8]); (optional)
|
|
|
On entry: the elements of wt must contain the weights to be used in the analysis. The effective number of observations is the sum of the weights. If the $i$th element of wt is $0.0$, then the $i$th observation is not included in the analysis.
|
|
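Constraints: the $i$th element of wt must be $\ge 0.0$, for $i = 1, 2, \ldots, n$; the sum of the weights must be $\ge \mathrm{nx} + \mathrm{ny} + 1$.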
|
|
Note: if wt is set to the null pointer NULL, i.e., (double *)0, then wt is not referenced and the effective number of observations is $n$.
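The way weights determine the effective number of observations can be sketched in Python. The helper `effective_observations` is hypothetical, and the lower bound of nx + ny + 1 on the effective number of observations is an assumption chosen to match the unweighted requirement that n exceed nx + ny.

```python
def effective_observations(n, nx, ny, wt=None):
    """Return the effective number of observations for the analysis.

    Hypothetical helper. With wt=None every observation counts once;
    otherwise the effective number is the sum of the weights, so an
    observation with weight 0.0 contributes nothing.
    """
    if wt is None:
        n_eff = float(n)
    else:
        if len(wt) != n:
            raise ValueError("wt must have n elements")
        if any(w < 0.0 for w in wt):
            raise ValueError("all elements of wt must be non-negative")
        n_eff = float(sum(wt))
    # Assumed bound: at least nx + ny + 1 effective observations are needed.
    if n_eff < nx + ny + 1:
        raise ValueError("too few effective observations")
    return n_eff
```

For nine unweighted observations the result is 9.0; giving one of them weight 0.0 and the rest weight 1.0 reduces it to 8.0.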
|
|
|
'tde'=tde - integer; (optional)
|
|
|
On entry: the second dimension of the array e as declared in the function from which nag_mv_canon_corr (g03adc) is called.
|
|
Constraint: $\mathrm{tde} \ge 6$.
|
|
|
'tdcvx'=tdcvx - integer; (optional)
|
|
|
On entry: the second dimension of the array cvx as declared in the function from which nag_mv_canon_corr (g03adc) is called.
|
|
Constraint: $\mathrm{tdcvx} \ge \min(\mathrm{nx}, \mathrm{ny})$.
|
|
|
'tdcvy'=tdcvy - integer; (optional)
|
|
|
On entry: the second dimension of the array cvy as declared in the function from which nag_mv_canon_corr (g03adc) is called.
|
|
Constraint: $\mathrm{tdcvy} \ge \min(\mathrm{nx}, \mathrm{ny})$.
|
|
|
'fail'=fail - table; (optional)
|
|
|
The NAG error argument, see the documentation for NagError.
|
|
|
|
Description
|
|
|
Purpose
|
|
nag_mv_canon_corr (g03adc) performs canonical correlation analysis upon input data matrices.
|
|
Description
|
|
Let there be two sets of variables, $x$ and $y$. For a sample of $n$ observations on $n_x$ variables in a data matrix $X$ and $n_y$ variables in a data matrix $Y$, canonical correlation analysis seeks to find a small number of linear combinations of each set of variables in order to explain or summarise the relationships between them. The variables thus formed are known as canonical variates.
Let the variance-covariance matrix of the two data sets be
$$\begin{pmatrix} S_{xx} & S_{xy} \\ S_{yx} & S_{yy} \end{pmatrix}$$
and let
$$\Sigma = S_{yy}^{-1} S_{yx} S_{xx}^{-1} S_{xy},$$
then the canonical correlations can be calculated from the eigenvalues of the matrix $\Sigma$. However, nag_mv_canon_corr (g03adc) calculates the canonical correlations by means of a singular value decomposition (SVD) of a matrix $V$. If the rank of the data matrix $X$ is $k_x$ and the rank of the data matrix $Y$ is $k_y$, and both $X$ and $Y$ have had variable (column) means subtracted, then the $k_x$ by $k_y$ matrix $V$ is given by:
$$V = Q_x^T Q_y,$$
where $Q_x$ is the first $k_x$ columns of the orthogonal matrix $Q$ either from the $QR$ decomposition of $X$ if $X$ is of full column rank, i.e., $k_x = n_x$:
$$X = Q_x R_x$$
or from the SVD of $X$ if $k_x < n_x$:
$$X = Q_x D_x P_x^T.$$
Similarly $Q_y$ is the first $k_y$ columns of the orthogonal matrix $Q$ either from the $QR$ decomposition of $Y$ if $Y$ is of full column rank, i.e., $k_y = n_y$:
$$Y = Q_y R_y$$
or from the SVD of $Y$ if $k_y < n_y$:
$$Y = Q_y D_y P_y^T.$$
Let the SVD of $V$ be:
$$V = U_x \Delta U_y^T,$$
then the non-zero elements of the diagonal matrix $\Delta$, $\delta_i$, for $i = 1, 2, \ldots, l$, are the $l$ canonical correlations associated with the $l$ canonical variates, where $l = \min(k_x, k_y)$.
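The route from centred data matrices to canonical correlations can be sketched with NumPy. This is an illustration of the idea only, not the NAG implementation: the function names are hypothetical, an SVD is used to obtain the orthonormal basis in both the full-rank and rank-deficient cases (rather than a QR decomposition when the matrix has full column rank), the rank test against `tol` is an assumed rule, and weights are not handled.

```python
import numpy as np


def orthonormal_basis(A, tol=1e-6):
    """Orthonormal basis for the column space of the mean-centred matrix A.

    Assumption: the numerical rank counts singular values larger than
    tol times the largest one; the NAG routine's exact rule may differ.
    """
    Ac = A - A.mean(axis=0)                      # subtract column means
    U, s, _ = np.linalg.svd(Ac, full_matrices=False)
    if s.size == 0 or s[0] == 0.0:
        raise ValueError("data matrix has rank zero")
    k = int(np.sum(s > tol * s[0]))              # numerical rank
    return U[:, :k]                              # n-by-k matrix Q


def canonical_correlations(X, Y, tol=1e-6):
    """Singular values of V = Qx^T Qy, i.e. the canonical correlations."""
    Qx = orthonormal_basis(np.asarray(X, dtype=float), tol)
    Qy = orthonormal_basis(np.asarray(Y, dtype=float), tol)
    V = Qx.T @ Qy                                # k_x-by-k_y matrix
    delta = np.linalg.svd(V, compute_uv=False)   # sorted, largest first
    return np.clip(delta, 0.0, 1.0)              # guard against rounding


# Data from the example on this page; columns chosen according to isz.
z = np.array([[80, 58.4, 14, 21], [75, 59.2, 15, 27], [78, 60.3, 15, 27],
              [75, 57.4, 13, 22], [79, 59.5, 14, 26], [78, 58.1, 14.5, 26],
              [75, 58, 12.5, 23], [64, 55.5, 11, 22], [80, 59.2, 12.5, 22]])
delta = canonical_correlations(z[:, [1, 2]], z[:, [0, 3]])
print(delta)
```

Using one SVD for both cases keeps the sketch short; a QR decomposition would give the same column space when the data matrix has full column rank.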
The eigenvalues, $\lambda_i^2$, of the matrix $\Sigma$ are given by:
$$\lambda_i^2 = \frac{\delta_i^2}{1 - \delta_i^2}.$$
The value of $\pi_i = \lambda_i^2 / \sum_{j} \lambda_j^2$ gives the proportion of variation explained by the $i$th canonical variate. The values of the $\pi_i$ give an indication as to how many canonical variates are needed to adequately describe the data, i.e., the dimensionality of the problem.
To test for a significant dimensionality greater than $i$ the $\chi^2$ statistic:
$$\left(n - \tfrac{1}{2}(k_x + k_y + 3)\right) \sum_{j=i+1}^{l} \log\left(1 + \lambda_j^2\right)$$
can be used. This is asymptotically distributed as a $\chi^2$ distribution with $(k_x - i)(k_y - i)$ degrees of freedom. If the test for $i = k_{\min}$ is not significant, then the remaining tests for $i > k_{\min}$ should be ignored.
The loadings for the canonical variates are calculated from the matrices $U_x$ and $U_y$ respectively. These matrices are scaled so that the canonical variates have unit variance.
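The summary statistics described above can be derived from a list of canonical correlations with a short Python sketch. The function `cca_statistics` and its dictionary keys are hypothetical names, and the formulas assumed are $\lambda_i^2 = \delta_i^2/(1-\delta_i^2)$, proportions $\lambda_i^2/\sum_j \lambda_j^2$, the statistic $(n - \tfrac{1}{2}(k_x + k_y + 3)) \sum_{j>i} \log(1 + \lambda_j^2)$ and $(k_x - i)(k_y - i)$ degrees of freedom. Significance levels are omitted because they need a chi-square distribution function.

```python
import math


def cca_statistics(delta, n, kx, ky):
    """Eigenvalues, proportions and test statistics from correlations.

    Hypothetical helper. delta holds the canonical correlations, largest
    first; n is the (effective) number of observations; kx and ky are
    the ranks of the two data matrices.
    """
    if any(d >= 1.0 for d in delta):
        raise ValueError("a canonical correlation is equal to one")
    lam2 = [d * d / (1.0 - d * d) for d in delta]
    total = sum(lam2)
    scale = n - 0.5 * (kx + ky + 3)
    rows = []
    for i in range(len(delta)):
        rows.append({
            "delta": delta[i],
            "lambda2": lam2[i],
            "proportion": lam2[i] / total if total > 0.0 else 0.0,
            # Tests for dimensionality greater than i: sums the terms
            # for variates i+1, ..., l (indices i onward, zero-based).
            "chi2": scale * sum(math.log(1.0 + v) for v in lam2[i:]),
            "df": (kx - i) * (ky - i),
        })
    return rows


# Illustrative correlations only, not results from the NAG routine.
rows = cca_statistics([0.9, 0.5], n=9, kx=2, ky=2)
```

Each row pairs one canonical correlation with the statistic that tests whether any dimensions beyond the earlier variates are significant.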
|
|
Error Indicators and Warnings
|
|
"NE_3_INT_ARG_CONS"
On entry, n = <value>, nx = <value> and ny = <value>. These arguments must satisfy n > nx + ny.
"NE_ALLOC_FAIL"
Dynamic memory allocation failed.
"NE_CANON_CORR_1"
A canonical correlation is equal to one. This will happen if the $x$ and $y$ variables are perfectly correlated.
"NE_INT_ARG_LT"
On entry, nx must not be less than 1: nx = <value>.
"NE_INTERNAL_ERROR"
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please consult NAG for assistance.
"NE_MAT_RANK_ZERO"
The rank of the $X$ matrix or the rank of the $Y$ matrix is zero. This will happen if all the $x$ and $y$ variables are constants.
"NE_NEG_WEIGHT_ELEMENT"
On entry, wt[<value>] = <value>. Constraint: when referenced, all elements of wt must be non-negative.
"NE_OBSERV_LT_VAR"
With weighted data, the effective number of observations given by the sum of weights = <value>, while the number of variables included in the analysis, nx + ny = <value>. Constraint: effective number of observations $\ge$ nx + ny + 1.
"NE_REAL_ARG_LT"
On entry, tol must not be less than 0.0: tol = <value>.
"NE_SVD_NOT_CONV"
The singular value decomposition has failed to converge. This is an unlikely error exit.
"NE_VAR_INCL_INDICATED"
The number of $x$ variables in the analysis, nx = <value>, while the number of $x$ variables included in the analysis via array isz = <value>. Constraint: these two numbers must be the same. The number of $y$ variables in the analysis, ny = <value>, while the number of $y$ variables included in the analysis via array isz = <value>. Constraint: these two numbers must be the same.
|
|
Accuracy
|
|
As the computation uses orthogonal matrices and a singular value decomposition, rather than the traditional approach of forming a sum of squares matrix and computing an eigenvalue decomposition, nag_mv_canon_corr (g03adc) should be less affected by ill-conditioned problems.
|
|
|
Examples
|
|
>
|
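# Problem dimensions: 9 observations on 4 variables, 2 in each set.
n := 9:
m := 4:
tdz := 4:
nx := 2:
ny := 2:
# Second dimensions of the output arrays e, cvx and cvy.
tde := 6:
tdcvx := 2:
tdcvy := 2:
tol := 1e-06:
# Data matrix: one row per observation, one column per variable.
z := Matrix([[80, 58.4, 14, 21], [75, 59.2, 15, 27], [78, 60.3, 15, 27], [75, 57.4, 13, 22], [79, 59.5, 14, 26], [78, 58.1, 14.5, 26], [75, 58, 12.5, 23], [64, 55.5, 11, 22], [80, 59.2, 12.5, 22]], datatype=float[8], order='C_order'):
# Columns 2 and 3 of z are x variables (positive entries); columns 1 and 4 are y variables (negative entries).
isz := Vector([-1, 1, 1, -1], datatype=integer[kernelopts('wordsize')/8]):
# No weights are supplied.
wt := Vector([], datatype=float[8]):
# Output arrays, filled in by the call below.
e := Matrix(2, 6, datatype=float[8], order='C_order'):
cvx := Matrix(2, 2, datatype=float[8], order='C_order'):
cvy := Matrix(2, 2, datatype=float[8], order='C_order'):
# On exit, ncv, e, cvx and cvy hold the results of the analysis.
NAG:-g03adc(z, isz, e, ncv, cvx, cvy, tol, 'n' = n, 'm' = m, 'tdz' = tdz, 'nx' = nx, 'ny' = ny, 'wt' = wt, 'tde' = tde, 'tdcvx' = tdcvx, 'tdcvy' = tdcvy):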
|
|
|
See Also
|
|
Chatfield C and Collins A J (1980) Introduction to Multivariate Analysis Chapman and Hall
Kendall M G and Stuart A (1976) The Advanced Theory of Statistics (Volume 3) (3rd Edition) Griffin
Morrison D F (1967) Multivariate Statistical Methods McGraw–Hill
g03 Chapter Introduction.
NAG Toolbox Overview.
NAG Web Site.
|
|