Given a matrix with missing values, impute the missing entries using a low-rank SVD approximation estimated by the EM algorithm.
impute.svd(x, k = min(n, p), tol = max(n, p) * 1e-10, maxiter = 100)
x | a matrix whose missing entries are to be imputed.
k | the rank of the SVD approximation.
tol | the convergence tolerance for the EM algorithm.
maxiter | the maximum number of EM steps to take.
Impute the missing values of x as follows: First, initialize all NA values to the column means, or 0 if all entries in the column are missing. Then, until convergence, compute the first k terms of the SVD of the completed matrix. Replace the previously missing values with their approximations from the SVD, and compute the residual sum of squares (RSS) between the non-missing values and the SVD approximation. Declare convergence if abs(rss0 - rss1) / (.Machine$double.eps + rss1) < tol, where rss0 and rss1 are the RSS values computed from successive iterations. If convergence has not been reached after maxiter iterations, stop and issue a warning.
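The steps described above can be sketched in base R. This is a minimal illustration written from the description only, not the package's source code: the function name impute_svd_sketch, its internal variable names, and the warning text are all hypothetical, and edge cases may be handled differently by impute.svd itself.

```r
# Hypothetical helper (NOT the package's implementation): EM-style
# low-rank SVD imputation, following the steps described above.
impute_svd_sketch <- function(x, k = min(dim(x)),
                              tol = max(dim(x)) * 1e-10, maxiter = 100) {
  x <- as.matrix(x)
  miss <- is.na(x)

  # Initialize NA entries to the column means, or 0 for all-missing columns
  # (colMeans returns NaN for a column with no observed values).
  col_means <- colMeans(x, na.rm = TRUE)
  col_means[is.nan(col_means)] <- 0
  xhat <- x
  xhat[miss] <- col_means[col(x)[miss]]

  rss0 <- Inf
  rss1 <- NA_real_
  converged <- FALSE
  iter <- 0
  while (iter < maxiter && !converged) {
    iter <- iter + 1

    # First k terms of the SVD of the completed matrix.
    s <- svd(xhat, nu = k, nv = k)
    approx <- s$u %*% (s$d[seq_len(k)] * t(s$v))

    # Replace only the originally missing entries.
    xhat[miss] <- approx[miss]

    # RSS between the non-missing values and the SVD approximation.
    rss1 <- sum((x[!miss] - approx[!miss])^2)
    converged <- abs(rss0 - rss1) / (.Machine$double.eps + rss1) < tol
    rss0 <- rss1
  }
  if (!converged) {
    warning("no convergence after ", maxiter, " iterations")
  }

  list(x = xhat, rss = rss1, iter = iter)
}
```

Calling impute_svd_sketch(x, 1) on a matrix containing NA values returns a list with the same three components described under the return value: the completed matrix, the final RSS, and the iteration count. The observed entries of x are left unchanged; only the NA positions are overwritten.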
x | the completed version of the matrix.
rss | the sum of squares between the SVD approximation and the non-missing values in x.
iter | the number of EM iterations before the algorithm stopped.
Patrick O. Perry
Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D. and Altman, R.B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics 17(6), 520–525.
# Generate a matrix with missing entries
n <- 20
p <- 10
u <- rnorm( n )
v <- rnorm( p )
xfull <- u %*% rbind( v ) + rnorm( n*p )
miss <- sample( seq_len( n*p ), n )
x <- xfull
x[miss] <- NA
# impute the missing entries with a rank-1 SVD approximation
xhat <- impute.svd( x, 1 )$x
# compute the prediction error for the missing entries
sum( ( xfull-xhat )^2 )
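Because the completed matrix keeps the observed entries as they are, the sum above is driven by the missing positions alone. To judge whether that error is small, it may help to compare it against a simple baseline such as column-mean imputation, which is also the starting point of the algorithm. The following self-contained sketch uses base R only; the names xbase and err_missing and the seed are illustrative choices, not part of the package.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

# Same setup as the example: a noisy rank-1 matrix with n missing entries.
n <- 20
p <- 10
xfull <- rnorm(n) %*% rbind(rnorm(p)) + rnorm(n * p)
miss <- sample(seq_len(n * p), n)
x <- xfull
x[miss] <- NA

# Baseline: fill each missing entry with the mean of its column.
col_means <- colMeans(x, na.rm = TRUE)
xbase <- x
xbase[miss] <- col_means[col(x)[miss]]

# Squared prediction error restricted to the entries that were removed.
err_missing <- sum((xfull[miss] - xbase[miss])^2)
err_missing
```

Computing the same quantity for the SVD-imputed matrix, on the same x and miss, gives a like-for-like comparison between the two imputations.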