PcaNA | R Documentation |
Computes classical and robust principal components for incomplete data using an EM algorithm as described by Serneels and Verdonck (2008).
PcaNA(x, ...)
## Default S3 method:
PcaNA(x, k = ncol(x), kmax = ncol(x), conv=1e-10, maxiter=100,
method=c("cov", "locantore", "hubert", "grid", "proj", "class"), cov.control=NULL,
scale = FALSE, signflip = TRUE, crit.pca.distances = 0.975, trace=FALSE, ...)
## S3 method for class 'formula'
PcaNA(formula, data = NULL, subset, na.action, ...)
formula |
a formula with no response variable, referring only to numeric variables. |
data |
an optional data frame (or similar: see
|
subset |
an optional vector used to select rows (observations) of the
data matrix |
na.action |
a function which indicates what should happen
when the data contain |
... |
arguments passed to or from other methods. |
x |
a numeric matrix (or data frame) which provides the data for the principal components analysis. |
k |
number of principal components to compute. If |
kmax |
maximal number of principal components to compute.
Default is kmax=ncol(x). |
conv |
convergence criterion for the EM algorithm.
Default is conv=1e-10. |
maxiter |
maximal number of iterations for the EM algorithm.
Default is maxiter=100. |
method |
which PC method to use (classical or robust) - "class" means classical PCA
and one of the following "locantore", "hubert", "grid", "proj", "cov" specifies a
robust PCA method. If the method is "cov" - i.e. PCA based on a robust covariance matrix -
the argument |
cov.control |
control object in case of robust PCA based on a robust covariance matrix. |
scale |
a logical value indicating whether the variables should be
scaled to have unit variance (only possible if there are no constant
variables). As a scale function |
signflip |
a logical value indicating whether to try to resolve the sign indeterminacy of the loadings,
using the ad hoc approach of setting the maximum element in a singular vector to be positive. Default is signflip=TRUE. |
crit.pca.distances |
criterion to use for computing the cutoff values for the orthogonal and score distances. Default is 0.975. |
trace |
whether to print intermediate results. Default is trace=FALSE. |
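The sign convention described for signflip can be illustrated with a small base R sketch. The helper flip_signs is hypothetical and not part of the package, and it assumes that "maximum element" means the element with the largest absolute value in each loading vector; the package's own implementation may differ.

```r
# Hypothetical helper (not from the package): flip each loading column so that
# its largest-magnitude element is positive.
flip_signs <- function(loadings) {
  loadings <- as.matrix(loadings)
  # Row index of the largest absolute value in every column
  idx <- apply(abs(loadings), 2, which.max)
  # Sign of that element, one per column
  signs <- sign(loadings[cbind(idx, seq_len(ncol(loadings)))])
  signs[signs == 0] <- 1  # leave all-zero columns unchanged
  # Multiply each column by its sign
  sweep(loadings, 2, signs, "*")
}

L <- matrix(c(-0.8, 0.6, 0.6, 0.8), nrow = 2)
flip_signs(L)
```

If loadings are flipped this way, the matching score columns need the same sign change so that the product of scores and loadings stays the same.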
PcaNA, serving as a constructor for objects of class PcaNA, is a generic function with "formula" and "default" methods. For details see the relevant references.
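As a rough illustration of the EM idea named in the description, the following base R sketch alternates between fitting a rank-k classical PCA and re-imputing the missing cells from that fit. It is only a sketch under stated assumptions: the function em_pca_impute is hypothetical, it covers the classical case only, and it is not the implementation used by PcaNA or the exact algorithm of Serneels and Verdonck (2008).

```r
# Hypothetical helper (not from the package): EM-style PCA imputation,
# classical (non-robust) version only.
em_pca_impute <- function(x, k = 2, conv = 1e-10, maxiter = 100) {
  x <- as.matrix(x)
  miss <- is.na(x)
  # Initial guess: fill each missing cell with its column mean
  col_means <- colMeans(x, na.rm = TRUE)
  x[miss] <- col_means[col(x)[miss]]
  for (iter in seq_len(maxiter)) {
    # Fit a rank-k PCA to the currently completed data
    center <- colMeans(x)
    xc <- sweep(x, 2, center)
    sv <- svd(xc, nu = k, nv = k)
    fitted <- sv$u %*% diag(sv$d[1:k], k, k) %*% t(sv$v)
    fitted <- sweep(fitted, 2, center, "+")
    # Replace only the missing cells by their fitted values
    change <- sum((x[miss] - fitted[miss])^2)
    x[miss] <- fitted[miss]
    if (change < conv) break
  }
  list(x = x, loadings = sv$v, iterations = iter)
}

# Simulated rank-2 data with 10 cells set to NA
set.seed(1)
z <- matrix(rnorm(40 * 2), 40, 2) %*% matrix(rnorm(2 * 5), 2, 5)
z_na <- z
z_na[sample(length(z), 10)] <- NA
res <- em_pca_impute(z_na, k = 2)
anyNA(res$x)
```

The arguments conv and maxiter in the sketch play the same role as the arguments of the same name above: the loop stops when the imputed values stop changing or when the iteration limit is reached.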
An S4 object of class PcaNA, which is a subclass of the virtual class Pca-class.
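The role of crit.pca.distances can be illustrated for the score distances. The sketch below uses base R only (prcomp on simulated complete data) and assumes the common convention from the robust PCA literature that the score distance cutoff is the square root of a chi-square quantile with k degrees of freedom. The variable names are illustrative, and the cutoff for the orthogonal distances is not shown.

```r
# Illustrative sketch with simulated complete data, not package code
set.seed(42)
x <- matrix(rnorm(100 * 4), 100, 4)
pc <- prcomp(x)

k <- 2          # number of retained components
crit <- 0.975   # same meaning as crit.pca.distances

scores <- pc$x[, 1:k, drop = FALSE]
eig <- pc$sdev[1:k]^2

# Score distance: distance of each observation within the PCA subspace,
# scaled by the eigenvalues
sd_dist <- sqrt(rowSums(sweep(scores^2, 2, eig, "/")))

# Assumed cutoff: square root of the chi-square quantile with k degrees of freedom
cutoff_sd <- sqrt(qchisq(crit, df = k))

# Observations whose score distance exceeds the cutoff
flagged <- which(sd_dist > cutoff_sd)
```

Raising crit moves the cutoff outward, so fewer observations are flagged.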
Valentin Todorov valentin.todorov@chello.at
Serneels S & Verdonck T (2008), Principal component analysis for data containing outliers and missing elements. Computational Statistics and Data Analysis, 52(3), 1712–1727.
Todorov V & Filzmoser P (2009), An Object Oriented Framework for Robust Multivariate Analysis. Journal of Statistical Software, 32(3), 1–47. <doi:10.18637/jss.v032.i03>.
library(rrcov)
## 1. With complete data
## PCA of the bushfire data
data(bushfire)
pca <- PcaNA(bushfire)
pca
## Compare with the classical PCA
prcomp(bushfire)
## or
PcaNA(bushfire, method="class")
## If you want to print the scores too, use
print(pca, print.x=TRUE)
## Using the formula interface
PcaNA(~., data=bushfire)
## To plot the results:
plot(pca) # distance plot
pca2 <- PcaNA(bushfire, k=2)
plot(pca2) # PCA diagnostic plot (or outlier map)
## Use the standard plots available for prcomp and princomp
screeplot(pca)
biplot(pca)
################################################################
## 2. Now the same with incomplete data - bush10
data(bush10)
pca <- PcaNA(bush10)
pca
## Compare with the classical PCA
PcaNA(bush10, method="class")
## If you want to print the scores too, use
print(pca, print.x=TRUE)
## Using the formula interface
PcaNA(~., data=as.data.frame(bush10))
## To plot the results:
plot(pca) # distance plot
pca2 <- PcaNA(bush10, k=2)
plot(pca2) # PCA diagnostic plot (or outlier map)
## Use the standard plots available for prcomp and princomp
screeplot(pca)
biplot(pca)