supplementaryVariables4PLSCA: Project supplementary variables (columns) for a PLSCA...

View source: R/supplementaryVariables4PLSCA.R

supplementaryVariables4PLSCA    R Documentation

Project supplementary variables (columns) for a PLSCA analysis (from tepPLSCA). **Beta Version. Current Version 07/30/2020.**

Description

supplementaryVariables4PLSCA: Projects supplementary variables (columns) for a PLSCA analysis (from tepPLSCA). The variables should be measured on the same observations as those used in the original analysis. The original data consisted of 2 matrices (containing non-negative numbers such as counts, as in correspondence analysis, or often simply 0/1 values, as in multiple correspondence analysis) denoted X (dimensions N by I) and Y (N by J). The supplementary data, denoted Vsup, form an N by K matrix that can be considered as originating either from X (and then denoted Xsup) or from Y (and then denoted Ysup). If Vsup originates from X (resp. Y), matrix Y (resp. X) is the dual matrix. Note that only the dual matrix is needed to project supplementary variables. See Details for more.

Usage

supplementaryVariables4PLSCA(
  var.sup,
  make.var.sup.nominal = TRUE,
  resPLSCA,
  Xset = NULL,
  make.Xset.nominal = TRUE,
  Yset = NULL,
  make.Yset.nominal = TRUE,
  dimNames = "Dimension "
)

Arguments

var.sup

Vsup: The N by K matrix of K supplementary variables.

make.var.sup.nominal

logical; when TRUE (default), transforms each column of var.sup from a factor with M levels into a set of M 0/1 vectors (to create a group coding, also called complete disjunctive coding).

resPLSCA

the results of a PLSCA analysis performed with tepPLSCA.

Xset

the original X (N by I) data matrix. If NULL, the supplementary data are projected on the dual set (i.e., Y). See also details for more.

make.Xset.nominal

logical; when TRUE (default), transforms each column of Xset from a factor with M levels into a set of M 0/1 vectors (to create a group coding, also called complete disjunctive coding).

Yset

the original Y (N by J) data matrix. If NULL, the supplementary data are projected on the dual set (i.e., X). See also details for more.

make.Yset.nominal

logical; when TRUE (default), transforms each column of Yset from a factor with M levels into a set of M 0/1 vectors (to create a group coding, also called complete disjunctive coding).

dimNames

Names for the dimensions (i.e., factors) for the supplementary loadings (Default: 'Dimension ').
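The group coding (complete disjunctive coding) mentioned for the make.*.nominal arguments can be illustrated with a minimal base R sketch. The toy data frame and the helper disjunctive() are hypothetical and are not part of the package; the package itself lists makeNominalData under See Also for this step.

```r
# Hypothetical toy data: 6 observations, 2 nominal variables (not from the package).
var.sup <- data.frame(
  color = factor(c("red", "blue", "red", "green", "blue", "red")),
  size  = factor(c("S", "L", "S", "S", "L", "L"))
)

# Hypothetical helper: expand one factor with M levels into M 0/1 columns.
disjunctive <- function(f, name) {
  Z <- outer(as.character(f), levels(f), FUN = "==") * 1  # logical -> 0/1
  colnames(Z) <- paste(name, levels(f), sep = ".")
  Z
}

# Bind the indicator blocks of all variables into one N by (sum of M) matrix.
Vsup <- do.call(cbind, Map(disjunctive, var.sup, names(var.sup)))
Vsup
```

Each observation has exactly one 1 per original variable, so every row of Vsup sums to the number of nominal variables (here 2).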

Details

The computation relies on the generalized singular value decomposition (GSVD) of the contingency table between X and Y, computed as R = X'Y (where X and Y are the original data matrices, preprocessed as in the original PLSCA analysis, e.g., transformed into 0/1 vectors) and decomposed with the GSVD as R = PDQ', under the (metric) constraints P' inv(Dr) P = Q' inv(Dc) Q = I, where inv() denotes the inverse matrix and where Dr (resp. Dc) is the diagonal matrix of the barycenters of X (resp. Y).
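As a rough illustration of this decomposition, the sketch below computes a GSVD of R = X'Y in base R through a plain SVD. It follows the usual correspondence analysis conventions, which are assumptions here and may differ from the package internals: R is rescaled to sum to 1, the independence model is removed before the decomposition, and the row and column masses play the role of the barycenters. The toy matrices X and Y are hypothetical.

```r
# Hypothetical 0/1 disjunctive data on N = 8 observations (not from the package).
X <- cbind(a1 = c(1,1,0,0,1,0,1,0), a2 = c(0,0,1,1,0,1,0,1),
           b1 = c(1,0,0,1,1,0,0,0), b2 = c(0,1,0,0,0,1,0,1),
           b3 = c(0,0,1,0,0,0,1,0))
Y <- cbind(c1 = c(1,0,1,0,0,1,1,0), c2 = c(0,1,0,1,1,0,0,1),
           d1 = c(1,1,0,0,0,0,1,1), d2 = c(0,0,1,1,1,1,0,0))

R  <- crossprod(X, Y)      # I by J contingency table, R = X'Y
Z  <- R / sum(R)           # assumption: rescale so that entries sum to 1
r  <- rowSums(Z)           # row masses (diagonal of Dr)
cc <- colSums(Z)           # column masses (diagonal of Dc)

# Plain SVD of inv(sqrt(Dr)) (Z - r cc') inv(sqrt(Dc)).
S  <- (Z - outer(r, cc)) / sqrt(outer(r, cc))
sv <- svd(S)
keep  <- sv$d > 1e-10      # drop null dimensions
Delta <- sv$d[keep]        # singular values

# Generalized singular vectors: P' inv(Dr) P = Q' inv(Dc) Q = I.
P <- sqrt(r)  * sv$u[, keep, drop = FALSE]
Q <- sqrt(cc) * sv$v[, keep, drop = FALSE]

round(crossprod(P, P / r), 10)   # identity matrix
round(crossprod(Q, Q / cc), 10)  # identity matrix
```

Multiplying the ordinary singular vectors by the square roots of the masses is what turns the orthonormality of the plain SVD into the metric constraints stated above.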

Transition formulas

The loadings of one set can be obtained from the cross-product matrix R and the loadings of the dual set. Denote by Delta the diagonal matrix of the singular values, by F (resp. G) the singular-value-normalized factor scores (denoted fi, resp. fj, in PLSCA), and by L (resp. C) the row (resp. column) profiles. The loadings of one set are then derived from those of the other set as:

F = LG inv(Delta) and G = CF inv(Delta) (with inv(Delta) being the inverse of Delta). Eq. 1
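Eq. 1 can be checked numerically with a short base R sketch. The toy data, the rescaling of R to sum 1, and the removal of the independence model are assumptions taken from standard correspondence analysis, not a description of the package internals; Fi and Gj stand for F and G (the name F is avoided because it aliases FALSE in R).

```r
# Hypothetical 0/1 disjunctive data on N = 8 observations (not from the package).
X <- cbind(a1 = c(1,1,0,0,1,0,1,0), a2 = c(0,0,1,1,0,1,0,1),
           b1 = c(1,0,0,1,1,0,0,0), b2 = c(0,1,0,0,0,1,0,1),
           b3 = c(0,0,1,0,0,0,1,0))
Y <- cbind(c1 = c(1,0,1,0,0,1,1,0), c2 = c(0,1,0,1,1,0,0,1),
           d1 = c(1,1,0,0,0,0,1,1), d2 = c(0,0,1,1,1,1,0,0))

# GSVD of the contingency table through a plain SVD (CA conventions assumed).
Z  <- crossprod(X, Y); Z <- Z / sum(Z)
r  <- rowSums(Z); cc <- colSums(Z)
sv <- svd((Z - outer(r, cc)) / sqrt(outer(r, cc)))
keep  <- sv$d > 1e-10
Delta <- sv$d[keep]
P <- sqrt(r)  * sv$u[, keep, drop = FALSE]
Q <- sqrt(cc) * sv$v[, keep, drop = FALSE]

# Singular-value-normalized factor scores: F = inv(Dr) P Delta, G = inv(Dc) Q Delta.
Fi <- sweep(P, 2, Delta, "*") / r
Gj <- sweep(Q, 2, Delta, "*") / cc

# Row and column profiles.
L <- Z / r          # each row of Z divided by its row sum
C <- t(Z) / cc      # each column of Z divided by its column sum

# Transition formulas (Eq. 1).
F_from_G <- sweep(L %*% Gj, 2, Delta, "/")   # F = L G inv(Delta)
G_from_F <- sweep(C %*% Fi, 2, Delta, "/")   # G = C F inv(Delta)

all.equal(F_from_G, Fi, check.attributes = FALSE)
all.equal(G_from_F, Gj, check.attributes = FALSE)
```

Because each set of factor scores is recovered from the profiles and the scores of the other set alone, the same formulas can be applied to profiles that did not take part in the decomposition.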

Projection of supplementary variables

Supplementary variable loadings are obtained by first computing the cross-product matrix (Rsup) of the supplementary variables with their dual set, and then using the transition formulas from correspondence analysis to compute one set of loadings from the loadings of the other set. For example, the loadings (denoted fii) for supplementary variables originating from the Xset can be obtained from the row profiles of the Rsup matrix by replacing L with Lsup in Eq. 1.
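The projection step can be sketched in base R as follows. This is an illustration under standard correspondence analysis conventions (R rescaled to sum 1, independence model removed), not the package's own code; the toy data and the helper project_sup_x() are hypothetical. As a sanity check, projecting active columns of X as if they were supplementary should return their active factor scores.

```r
# Hypothetical 0/1 disjunctive data on N = 8 observations (not from the package).
X <- cbind(a1 = c(1,1,0,0,1,0,1,0), a2 = c(0,0,1,1,0,1,0,1),
           b1 = c(1,0,0,1,1,0,0,0), b2 = c(0,1,0,0,0,1,0,1),
           b3 = c(0,0,1,0,0,0,1,0))
Y <- cbind(c1 = c(1,0,1,0,0,1,1,0), c2 = c(0,1,0,1,1,0,0,1),
           d1 = c(1,1,0,0,0,0,1,1), d2 = c(0,0,1,1,1,1,0,0))

# Active analysis: GSVD of R = X'Y through a plain SVD (CA conventions assumed).
Z  <- crossprod(X, Y); Z <- Z / sum(Z)
r  <- rowSums(Z); cc <- colSums(Z)
sv <- svd((Z - outer(r, cc)) / sqrt(outer(r, cc)))
keep  <- sv$d > 1e-10
Delta <- sv$d[keep]
Fi <- sweep(sqrt(r)  * sv$u[, keep, drop = FALSE], 2, Delta, "*") / r
Gj <- sweep(sqrt(cc) * sv$v[, keep, drop = FALSE], 2, Delta, "*") / cc

# Hypothetical helper: project supplementary X-side variables using only the
# dual set Y and its factor scores, Fsup = Lsup G inv(Delta).
project_sup_x <- function(Xsup, Y, Gj, Delta) {
  Rsup <- crossprod(Xsup, Y)          # K by J cross-product with the dual set
  Lsup <- Rsup / rowSums(Rsup)        # row profiles of Rsup
  sweep(Lsup %*% Gj, 2, Delta, "/")   # Eq. 1 with L replaced by Lsup
}

# A new nominal variable (2 levels) measured on the same 8 observations.
Xsup <- cbind(e1 = c(1,0,1,0,1,0,1,0), e2 = c(0,1,0,1,0,1,0,1))
Fsup <- project_sup_x(Xsup, Y, Gj, Delta)

# Sanity check: active columns projected as supplementary give back Fi.
F_check <- project_sup_x(X[, 1:2], Y, Gj, Delta)
all.equal(F_check, Fi[1:2, , drop = FALSE], check.attributes = FALSE)
```

Only Y, Gj, and Delta enter the helper, which matches the remark that the dual matrix alone is needed to project supplementary variables.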

Value

a list with the following elements:

  • "loadings.sup.X": The loadings of the supplementary variables as originating from the Xset (needs to have the dual Yset to be computed).

  • "sup.fi": The singular-value-normalized loadings of the supplementary variables as originating from the Xset (needs to have the dual Yset to be computed).

  • "loadings.sup.Y": The loadings of the supplementary variables as originating from the Yset (needs to have the dual Xset to be computed).

  • "sup.fj": The singular-value-normalized loadings of the supplementary variables as originating from the Yset (needs to have the dual Xset to be computed).

  • "cor.lx": The correlations between the supplementary variables and the X set.

  • "cor.ly": The correlations between the supplementary variables and the Y set.

Author(s)

Hervé Abdi

References

See:

Beaton, D., Dunlop, J., ADNI, & Abdi, H. (2016). Partial Least Squares-Correspondence Analysis (PLSCA): A framework to simultaneously analyze behavioral and genetic data. Psychological Methods, 21, 621-651.

Abdi, H., & Béra, M. (2018). Correspondence analysis. In R. Alhajj and J. Rokne (Eds.), Encyclopedia of Social Network Analysis and Mining (2nd Edition). New York: Springer Verlag.

Abdi, H. (2007). Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD). In N.J. Salkind (Ed.): Encyclopedia of Measurement and Statistics. Thousand Oaks (CA): Sage. pp. 907-912.

See Also

tepPLSCA, makeNominalData, makeRowProfiles, supplementaryObservations4PLSC

Examples

## Not run: 
if(interactive()){
 #EXAMPLE1
 }

## End(Not run)

HerveAbdi/data4PCCAR documentation built on Sept. 11, 2022, 4:19 p.m.