impPCA | R Documentation
Greedy algorithm for EM-PCA including robust methods
impPCA(
  x,
  method = "classical",
  m = 1,
  eps = 0.5,
  k = ncol(x) - 1,
  maxit = 100,
  boot = FALSE,
  verbose = TRUE
)
x | data.frame or matrix
method | "classical" or "mcd" (robust estimation)
m | number of multiple imputations (only if parameter boot equals TRUE)
eps | threshold for convergence
k | number of principal components for reconstruction of x
maxit | maximum number of iterations
boot | residual bootstrap (if TRUE)
verbose | TRUE/FALSE if additional information about the imputation process should be printed
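To make the roles of k, eps and maxit concrete, here is a minimal base R sketch of classical iterative PCA imputation. It is not the code used by impPCA(): the function name em_pca_sketch, the column-mean starting values and the squared-change convergence rule are all assumptions made for illustration.

```r
# Hypothetical helper, NOT the implementation behind impPCA().
# Assumptions: column-mean initialisation, convergence measured as the
# sum of squared changes in the imputed cells.
em_pca_sketch <- function(x, k = ncol(x) - 1, eps = 0.5, maxit = 100) {
  x <- as.matrix(x)
  miss <- is.na(x)

  # Start from the column means of the observed values.
  col_means <- colMeans(x, na.rm = TRUE)
  x[miss] <- col_means[col(x)[miss]]

  it <- 0L
  for (it in seq_len(maxit)) {
    center <- colMeans(x)
    xc <- sweep(x, 2, center)

    # Loadings of the first k principal components.
    v <- svd(xc, nu = 0, nv = k)$v

    # Rank-k reconstruction, shifted back to the original location.
    recon <- sweep(xc %*% v %*% t(v), 2, center, "+")

    # Only the missing cells are updated; observed cells stay untouched.
    change <- sum((x[miss] - recon[miss])^2)
    x[miss] <- recon[miss]
    if (change < eps) break
  }
  list(x = x, iterations = it)
}

# Usage on simulated data with two missing cells.
set.seed(1)
a <- rnorm(30)
d <- cbind(a = a, b = 2 * a + rnorm(30, sd = 0.1))
d_na <- d
d_na[c(3, 7), "b"] <- NA
res <- em_pca_sketch(d_na, k = 1, eps = 1e-8)
```

A smaller eps or a larger maxit lets the loop run longer before it stops; k controls how much of the data structure the reconstruction is allowed to use.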
The imputed data set. If boot = FALSE this is a data.frame. If boot = TRUE this is a list where each list element contains a data.frame.
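Because the return type depends on boot, downstream code may want to treat both cases alike. The following is a small sketch of one way to do that; as_imputation_list is a hypothetical helper, and the two objects below are made-up stand-ins for impPCA() results, not real output.

```r
# Hypothetical helper: wrap a single data.frame in a list so that results
# obtained with boot = FALSE and boot = TRUE can be handled the same way.
as_imputation_list <- function(res) {
  # A data.frame is itself a list, so test for data.frame explicitly.
  if (is.data.frame(res)) list(res) else res
}

# Stand-ins for impPCA() results (made-up values, for illustration only).
single <- data.frame(a = c(1, 2), b = c(3, 4))  # shape returned with boot = FALSE
multiple <- list(single, single)                # shape returned with boot = TRUE

n_single <- length(as_imputation_list(single))
n_multiple <- length(as_imputation_list(multiple))
```

With this wrapper, code such as lapply(as_imputation_list(res), summary) works regardless of how boot was set.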
Matthias Templ
Serneels, S. and Verdonck, T. (2008). Principal component analysis for data containing outliers and missing elements. Computational Statistics and Data Analysis, 52(3), 1712-1727.
Other imputation methods: hotdeck(), irmi(), kNN(), matchImpute(), medianSamp(), rangerImpute(), regressionImp(), sampleCat()
library(VIM)

# Load the Animals data, perturb one value slightly and take logs
data(Animals, package = "MASS")
Animals$brain[19] <- Animals$brain[19] + 0.01
Animals <- log(Animals)
colnames(Animals) <- c("log(body)", "log(brain)")

# Set 10 values of log(brain) to missing; rows 6, 16 and 26 are never selected
Animals_na <- Animals
probs <- abs(Animals$`log(body)`^2)  # overwritten by the next line
probs <- rep(0.5, nrow(Animals))
probs[c(6,16,26)] <- 0
set.seed(1234)
Animals_na[sample(1:nrow(Animals), 10, prob = probs), "log(brain)"] <- NA
w <- is.na(Animals_na$`log(brain)`)

# Classical and robust imputation, with and without residual bootstrap
impPCA(Animals_na)
impPCA(Animals_na, method = "mcd")
impPCA(Animals_na, boot = TRUE, m = 10)
impPCA(Animals_na, method = "mcd", boot = TRUE)[[1]]

# Plot observed values, values set to missing, and imputed values
plot(`log(brain)` ~ `log(body)`, data = Animals, type = "n", ylab = "", xlab="")
mtext(text = "impPCA robust", side = 3)
points(Animals$`log(body)`[!w], Animals$`log(brain)`[!w])
points(Animals$`log(body)`[w], Animals$`log(brain)`[w], col = "grey", pch = 17)
imputed <- impPCA(Animals_na, method = "mcd", boot = TRUE)[[1]]
colnames(imputed) <- c("log(body)", "log(brain)")
points(imputed$`log(body)`[w], imputed$`log(brain)`[w], col = "red", pch = 20, cex = 1.4)
segments(x0 = Animals$`log(body)`[w], x1 = imputed$`log(body)`[w],
         y0 = Animals$`log(brain)`[w], y1 = imputed$`log(brain)`[w],
         lty = 2, col = "grey")
legend("topleft", legend = c("non-missings", "set to missing", "imputed values"),
       pch = c(1,17,20), col = c("black","grey","red"), cex = 0.7)

# Error measures comparing imputed values with the true values
mape <- round(100* 1/sum(is.na(Animals_na$`log(brain)`)) *
  sum(abs((Animals$`log(brain)` - imputed$`log(brain)`) / Animals$`log(brain)`)), 2)
s2 <- var(Animals$`log(brain)`)
nrmse <- round(sqrt(1/sum(is.na(Animals_na$`log(brain)`)) *
  sum(abs((Animals$`log(brain)` - imputed$`log(brain)`) / s2))), 2)
text(x = 8, y = 1.5, labels = paste("MAPE =", mape))
text(x = 8, y = 0.5, labels = paste("NRMSE =", nrmse))
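The two error measures in the example can be wrapped in small functions for reuse. This is only a sketch: the names mape_imputed and nrmse_imputed are hypothetical, and both functions deliberately mirror the formulas used in the example (restricted to the cells flagged as missing) rather than any official definition.

```r
# Hypothetical helpers mirroring the formulas in the example above.
# truth, imputed: numeric vectors of equal length; w: logical, TRUE where
# the value was set to missing and later imputed.
mape_imputed <- function(truth, imputed, w) {
  100 * sum(abs((truth[w] - imputed[w]) / truth[w])) / sum(w)
}

nrmse_imputed <- function(truth, imputed, w) {
  # As in the example: absolute differences scaled by the variance of truth.
  sqrt(sum(abs(truth[w] - imputed[w]) / var(truth)) / sum(w))
}

# Made-up values, for illustration only.
truth <- c(1, 2, 4)
imputed <- c(1, 2.5, 4)
w <- c(FALSE, TRUE, FALSE)

m1 <- mape_imputed(truth, imputed, w)
m2 <- nrmse_imputed(truth, imputed, w)
```

Since unimputed cells are identical in both vectors, restricting the sums to w gives the same result as summing over all rows, as the example does.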