PCAmix | R Documentation |
Performs principal component analysis of a set of individuals (observations) described by a mixture of qualitative and quantitative variables. PCAmix includes ordinary principal component analysis (PCA) and multiple correspondence analysis (MCA) as special cases.
PCAmix( X.quanti = NULL, X.quali = NULL, ndim = 5, rename.level = FALSE, weight.col.quanti = NULL, weight.col.quali = NULL, graph = TRUE )
X.quanti |
a numeric matrix of data, or an object that can be coerced to such a matrix (such as a numeric vector or a data frame with all numeric columns). |
X.quali |
a categorical matrix of data, or an object that can be coerced to such a matrix (such as a character vector, a factor or a data frame with all factor columns). |
ndim |
number of dimensions kept in the results (by default 5). |
rename.level |
boolean, if TRUE all the levels of the qualitative variables are renamed as follows: "variable_name=level_name". This prevents different variables from having identically named levels. |
weight.col.quanti |
vector of weights for the quantitative variables. |
weight.col.quali |
vector of weights for the qualitative variables. |
graph |
boolean, if TRUE the following graphics are displayed for the first two dimensions of PCAmix: component map of the individuals, plot of the squared loadings of all the variables (quantitative and qualitative), plot of the correlation circle (if quantitative variables are available), component map of the levels (if qualitative variables are available). |
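The renaming described for rename.level can be illustrated in base R. This is only a sketch of the naming scheme, not the package's internal code; the data frame df and its columns colour and scent are hypothetical.

```r
# Hypothetical data: two factors that share the level names "low" and "high"
df <- data.frame(
  colour = factor(c("low", "high", "low")),
  scent  = factor(c("high", "low", "low"))
)

# Prefix every level with its variable name: "variable_name=level_name"
renamed <- df
for (v in names(renamed)) {
  levels(renamed[[v]]) <- paste(v, levels(renamed[[v]]), sep = "=")
}

levels(renamed$colour)
levels(renamed$scent)
```

After renaming, no level name is shared between the two variables, so the levels can be told apart when they are plotted or listed together.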
If X.quali is not specified (i.e. NULL), only quantitative variables are available and standard PCA is performed. If X.quanti is NULL, only qualitative variables are available and standard MCA is performed.
Missing values are replaced by means for quantitative variables and by zeros in the indicator matrix for qualitative variables.
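A minimal base R sketch of the two replacement rules above, assuming a toy numeric vector x and a toy factor f (both hypothetical). It mirrors the description only and is not taken from the package source.

```r
# Hypothetical toy data with one missing value each
x <- c(1.0, NA, 3.0, 5.0)
f <- factor(c("a", "b", NA, "a"))

# Quantitative variable: replace NA by the mean of the observed values
x_imputed <- ifelse(is.na(x), mean(x, na.rm = TRUE), x)

# Qualitative variable: 0/1 indicator matrix with one column per level;
# an observation with a missing value gets a row of zeros
G <- sapply(levels(f), function(l) as.integer(!is.na(f) & f == l))

x_imputed
G
```

In this sketch the third row of G contains only zeros, so the observation with the missing level is assigned to no level of that variable.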
PCAmix returns the squared loadings in sqload. For a qualitative variable, the squared loadings are the correlation ratios between the variable and the principal components. For a quantitative variable, they are the squared correlations between the variable and the principal components.
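The two definitions can be checked by hand in base R. The sketch below uses a simulated vector score as a stand-in for one principal component, together with a hypothetical quantitative variable x and qualitative variable g; none of these names come from the package.

```r
set.seed(1)
n <- 60
score <- rnorm(n)                                  # stand-in for one principal component
x <- 0.8 * score + rnorm(n, sd = 0.5)              # hypothetical quantitative variable
g <- factor(rep(c("a", "b", "c"), length.out = n)) # hypothetical qualitative variable

# Quantitative variable: squared Pearson correlation with the component
sq_quanti <- cor(x, score)^2

# Qualitative variable: correlation ratio, i.e. the between-level
# sum of squares of the component divided by its total sum of squares
level_means <- as.numeric(tapply(score, g, mean))
level_sizes <- as.numeric(table(g))
between_ss  <- sum(level_sizes * (level_means - mean(score))^2)
total_ss    <- sum((score - mean(score))^2)
sq_quali    <- between_ss / total_ss

sq_quanti
sq_quali
```

Both quantities lie between 0 and 1, which is why quantitative and qualitative variables can be shown on the same squared loadings plot.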
Note that when all p variables are qualitative, the factor coordinates (scores) of the n observations equal the factor coordinates (scores) of standard MCA multiplied by the square root of p, and the eigenvalues equal the usual MCA eigenvalues multiplied by p. When all the variables are quantitative, PCAmix gives exactly the same results as standard PCA.
eig |
a matrix containing the eigenvalues, the percentages of variance and the cumulative percentages of variance. |
ind |
a list containing the results for the individuals (observations):
|
quanti |
a list containing the results for the quantitative variables:
|
levels |
a list containing the results for the levels of the qualitative variables:
|
quali |
a list containing the results for the qualitative variables:
|
sqload |
a matrix of dimension ( |
coef |
the coefficients of the linear combinations used to
construct the principal components of PCAmix, and to predict coordinates (scores) of new observations in the function |
M |
the vector of the weights of the columns used in the Generalized Singular Value Decomposition. |
Marie Chavent marie.chavent@u-bordeaux.fr, Amaury Labenne.
Chavent M., Kuentz-Simonet V., Labenne A., Saracco J., Multivariate analysis of mixed data: The PCAmixdata R package, arXiv:1411.4911 [stat.CO].
print.PCAmix, summary.PCAmix, predict.PCAmix, plot.PCAmix
#PCAMIX:
data(wine)
str(wine)
X.quanti <- splitmix(wine)$X.quanti
X.quali <- splitmix(wine)$X.quali
pca <- PCAmix(X.quanti[,1:27], X.quali, ndim=4)
pca <- PCAmix(X.quanti[,1:27], X.quali, ndim=4, graph=FALSE)
pca$eig
pca$ind$coord

#PCA:
data(decathlon)
quali <- decathlon[,13]
pca <- PCAmix(decathlon[,1:10])
pca <- PCAmix(decathlon[,1:10], graph=FALSE)
plot(pca, choice="ind", coloring.ind=quali, cex=0.8,
     posleg="topright", main="Scores")
plot(pca, choice="sqload", main="Squared correlations")
plot(pca, choice="cor", main="Correlation circle")
pca$quanti$coord

#MCA
data(flower)
mca <- PCAmix(X.quali=flower[,1:4], rename.level=TRUE, graph=FALSE)
plot(mca, choice="ind", main="Scores")
plot(mca, choice="sqload", main="Correlation ratios")
plot(mca, choice="levels", main="Levels")
mca$levels$coord

#Missing values
data(vnf)
PCAmix(X.quali=vnf, rename.level=TRUE)
vnf2 <- na.omit(vnf)
PCAmix(X.quali=vnf2, rename.level=TRUE)