CSCA: Common and Specific Correspondence Analysis (CSCA) of a set...

View source: R/CSCA.R

CSCA R Documentation

Common and Specific Correspondence Analysis (CSCA) of a set of K matched matrices, each of order I*J.


CSCA: implements Common and Specific Correspondence Analysis of a set of K matched matrices, each of order I*J.


CSCA(brickOfMat, nfact = 3, b = NULL)



brickOfMat: an I items by J descriptors by K blocks (e.g., matrices) array suitable for correspondence analysis (i.e., all elements are non-negative).


nfact: (Default = 3) number of factors to keep.


b: (Default = NULL) a K-element weight vector for the matrices (the weights should all be positive and sum to 1). When NULL, CSCA computes each weight as the sum of the corresponding matrix divided by the grand total. Note that it is generally better to pre-normalize the matrices for CSCA (e.g., with normBrick4PTCA), so that all matrices have the same weight.
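As a rough illustration of the default weighting described for b, the base R sketch below computes each weight as a block's total divided by the grand total, and then shows that rescaling every block to sum to 1 equalizes the weights. The array brick and the manual rescaling are assumptions of this sketch: the rescaling only stands in for a pre-normalization step and is not claimed to reproduce what normBrick4PTCA does.

```r
# Toy I x J x K array of non-negative counts (hypothetical data)
set.seed(42)
I <- 4; J <- 3; K <- 2
brick <- array(rpois(I * J * K, lambda = 5) + 1, dim = c(I, J, K))

# Default-style weights: sum of each block divided by the grand total
b <- apply(brick, 3, sum) / sum(brick)

# Stand-in pre-normalization (an assumption): rescale each block to sum to 1
brickNorm <- sweep(brick, 3, apply(brick, 3, sum), "/")

# After rescaling, every block carries the same weight, 1 / K
bNorm <- apply(brickNorm, 3, sum) / sum(brickNorm)
```

With unequal block totals, b favors the larger blocks; after rescaling, no block dominates the analysis merely because of its total.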


The analyses of the three matrices, whose results are given in allMatrices.resCA, sumOfMatrices.resCA, and diffOfMatrices, implement an ANOVA-like decomposition of the inertia such that All = Sum + Difference, or X_K = G + (X_K - G), with X_K being the set of matrices, G being the sum matrix (which in CA is the barycenter of all the matrices), and (X_K - G) being the set of the differences between the matrices and their barycenter.

The matrix X_K is obtained by stacking all the original matrices on top of each other (so X_K is an I*K by J matrix); the matrix G is obtained as the sum of all the matrices in X_K (so G is an I by J matrix); the matrix (X_K - G) is obtained by subtracting the matrix G from each matrix in X_K (so X_K - G is an I*K by J matrix).
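The construction of X_K, G, and (X_K - G) can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's internal code: the blocks are first turned into proportions (each block sums to 1), so that the sum of the raw matrices, once divided by the grand total, coincides with the weighted barycenter of the blocks. All object names (brick, P, XK, Gstacked, D) are hypothetical.

```r
# Toy I x J x K array of non-negative counts (hypothetical data)
set.seed(1)
I <- 4; J <- 3; K <- 3
brick <- array(rpois(I * J * K, lambda = 6) + 1, dim = c(I, J, K))

# Block weights and blocks expressed as proportions
b <- apply(brick, 3, sum) / sum(brick)
P <- sweep(brick, 3, apply(brick, 3, sum), "/")

# G from the sum of the raw matrices, rescaled to proportions ...
Gsum <- apply(brick, c(1, 2), sum)
G <- Gsum / sum(Gsum)
# ... equals the weighted barycenter of the blocks
Gbary <- apply(sweep(P, 3, b, "*"), c(1, 2), sum)

# X_K: blocks stacked on top of each other, an (I * K) x J matrix
XK <- do.call(rbind, lapply(seq_len(K), function(k) P[, , k]))
# G repeated K times so that it can be subtracted from every block
Gstacked <- do.call(rbind, replicate(K, G, simplify = FALSE))
# (X_K - G): stacked differences to the barycenter, also (I * K) x J
D <- XK - Gstacked
```

By construction XK equals Gstacked + D, which is the matrix counterpart of All = Sum + Difference.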

The analysis of the matrix G is made with a standard CA program, but the correspondence analysis of the matrices (X_K - G) needs a special CA program because this CA uses the row and column metrics from G. X_K uses the same centers as G, whereas the (X_K - G) matrix is uncentered. These analyses are performed by the function genCA, which allows specific metrics and centers. For the analyses of X_K and (X_K - G), the row factor scores (fi) are computed from the plain genCA analysis, and the column factor scores (fj) are obtained by partial projection using the correspondence analysis transition formula adapted to blocks of matrices.
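The role of the shared metrics and centers can be checked numerically. The base R sketch below does not call genCA (its arguments are not described here). Instead it computes chi-square-type inertias directly, using the row and column masses of G as metrics, and assumes that each block enters with its weight b_k. Under these assumptions the inertia of the blocks centered like G splits exactly into the inertia of G plus the inertia of the uncentered differences.

```r
# Toy I x J x K array of non-negative counts (hypothetical data)
set.seed(7)
I <- 5; J <- 4; K <- 3
brick <- array(rpois(I * J * K, lambda = 8) + 1, dim = c(I, J, K))

b <- apply(brick, 3, sum) / sum(brick)            # block weights
P <- sweep(brick, 3, apply(brick, 3, sum), "/")   # blocks as proportions
G <- apply(sweep(P, 3, b, "*"), c(1, 2), sum)     # barycenter

# Row and column masses of G define both the metrics and the center
rowMass <- rowSums(G)
colMass <- colSums(G)
center <- outer(rowMass, colMass)

# Squared norm of M under the metrics of G: sum of M_ij^2 / (r_i * c_j)
chi2norm <- function(M) sum(M^2 / center)

inertiaSum  <- chi2norm(G - center)
inertiaAll  <- sum(sapply(seq_len(K), function(k) b[k] * chi2norm(P[, , k] - center)))
inertiaDiff <- sum(sapply(seq_len(K), function(k) b[k] * chi2norm(P[, , k] - G)))

# Plain CA of G via the SVD: squared singular values add up to inertiaSum
Z <- (G - center) / sqrt(center)
eigsSum <- svd(Z)$d^2
```

The cross term vanishes because the weighted differences to the barycenter sum to zero, which is why the difference part can be analyzed without any further centering.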

Note that in a two-table version the partial column factor scores will be identical (and identical to the overall column factor scores), so in this case the overall column factor scores can be plotted.

Note, also, that the two-table version of CSCA could be obtained from the analysis of the [X_1 X_2 || X_2 X_1] circulant matrix (see Greenacre, 2003).
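The algebra behind the circulant matrix remark can be verified with a plain SVD: the singular values of the block matrix with rows [X_1 X_2] and [X_2 X_1] are those of the sum X_1 + X_2 together with those of the difference X_1 - X_2. The sketch below checks only this unweighted property on random matrices (hypothetical data); a CA version would additionally involve masses and metrics, which are left out here.

```r
# Two matched I x J matrices (hypothetical data)
set.seed(3)
I <- 5; J <- 4
X1 <- matrix(runif(I * J), I, J)
X2 <- matrix(runif(I * J), I, J)

# Block circulant matrix: [X1 X2] on top of [X2 X1], a (2I) x (2J) matrix
circ <- rbind(cbind(X1, X2), cbind(X2, X1))

# Singular values of the circulant matrix ...
svCirc <- sort(svd(circ)$d, decreasing = TRUE)
# ... versus the pooled singular values of the sum and of the difference
svParts <- sort(c(svd(X1 + X2)$d, svd(X1 - X2)$d), decreasing = TRUE)
```

A single decomposition of the circulant matrix therefore yields the common (sum) and the specific (difference) components together.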


A list with 1) allMatrices.resCA: results for the analysis of the whole set of matrices stacked on top of each other; 2) sumOfMatrices.resCA: results (from ExPosition::epCA) for the analysis of the sum (i.e., the average in CA) of all matrices; 3) diffOfMatrices: results for the analysis of the differences between the matrices and their average (from 2); 4) partialProjOnSum: the projection, as supplementary elements, of the matrices onto their average; and 5) RvCoefficients: the matrix of Rv-coefficients between the matrices.

allMatrices is a list containing a) fi: the I*K by nfact matrix of the row factor scores; b) fj: the J*nfact*K array by nfact array of the column factor scores; c) Dv: the singular values; d) eigs: the eigenvalues; e) tau: the percentage of inertia; and f) Inertia: the total inertia.

sumOfMatrices is a list storing the output of the plain correspondence analysis of the I*J matrix of the sum of matrices as analyzed by ExPosition::epCA (see help there for more details).

diffOfMatrices is a list containing a) fi: the I*K by nfact matrix of the row factor scores; b) fj: the J*nfact*K array by nfact array of the column factor scores; c) a J*L*K array of partial column factor scores; d) an (I*K) by L matrix of the projection of the original data onto the specific space (useful to explore the differences induced by the original data matrices); e) Dv: the singular values; f) eigs: the eigenvalues; g) tau: the percentage of inertia; and h) Inertia: the total inertia.

partialProjOnSum is a list containing a) fi: the I*K by nfact matrix of the (supplementary) row factor scores; b) fj: the I*nfact*K array by nfact array of the (supplementary) column factor scores.
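For readers unfamiliar with the Rv-coefficient reported in RvCoefficients, the sketch below computes the standard coefficient between every pair of blocks in base R. Which version of the matrices CSCA passes to this computation (raw, profiles, or centered) is not stated here, so applying it to the raw blocks, like the helper name rvCoef, is an assumption of this sketch.

```r
# Rv coefficient between two matrices that share the same rows:
# trace(Sx Sy) / sqrt(trace(Sx Sx) * trace(Sy Sy)), with Sx = X X'
rvCoef <- function(X, Y) {
  Sx <- tcrossprod(X)
  Sy <- tcrossprod(Y)
  # Sx and Sy are symmetric, so each trace is an elementwise sum
  sum(Sx * Sy) / sqrt(sum(Sx * Sx) * sum(Sy * Sy))
}

# Toy I x J x K array of non-negative counts (hypothetical data)
set.seed(11)
I <- 5; J <- 4; K <- 3
brick <- array(rpois(I * J * K, lambda = 7) + 1, dim = c(I, J, K))

# K x K matrix of Rv coefficients between all pairs of blocks
Rv <- matrix(NA_real_, K, K)
for (k in seq_len(K)) {
  for (l in seq_len(K)) {
    Rv[k, l] <- rvCoef(brick[, , k], brick[, , l])
  }
}
```

Values close to 1 indicate blocks with similar row configurations; the diagonal is 1 by construction.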


The ideas used here are derived from:

1) Escofier, B. (1983). Analyse de la différence entre deux mesures définies sur le produit de deux mêmes ensembles. Les Cahiers de l'Analyse des Données, 8, 325-329.

2) Escofier, B., & Drouet, D. (1983). Analyse des différences entre plusieurs tableaux de fréquences. Les Cahiers de l'Analyse des Données, 8, 491-499.

3) Benali, H., & Escofier, B. (1990). Analyse factorielle lissée et analyse factorielle des différences locales. Revue de Statistique Appliquée, 38, 55-76.

4) Greenacre, M. (2003). Singular value decomposition of matched matrices. Journal of Applied Statistics, 30, 1101-1113.

5) Takane, Y. (2014). Constrained Principal Component Analysis and Related Techniques. Boca Raton: CRC Press.

See Also

normBrick4PTCA, genPCA



## Not run: 

## End(Not run)
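Since the example section above is empty, here is a tentative usage sketch based only on the signature CSCA(brickOfMat, nfact = 3, b = NULL) and on the list elements described earlier. The toy data and the dimension names are hypothetical, and the call is guarded so that it runs only when the PTCA4CATA package is installed.

```r
# Toy 6 items x 4 descriptors x 3 blocks array (hypothetical data and labels)
set.seed(2022)
brick <- array(
  rpois(6 * 4 * 3, lambda = 10) + 1,
  dim = c(6, 4, 3),
  dimnames = list(paste0("item", 1:6),
                  paste0("desc", 1:4),
                  paste0("block", 1:3))
)

# Run CSCA only if the package is available; keep 2 factors, default weights
if (requireNamespace("PTCA4CATA", quietly = TRUE)) {
  resCSCA <- PTCA4CATA::CSCA(brickOfMat = brick, nfact = 2)
  print(names(resCSCA))  # list elements as described in the value section
}
```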

HerveAbdi/PTCA4CATA documentation built on July 17, 2022, 5:41 a.m.