softImpute | R Documentation |
Fit a low-rank matrix approximation to a matrix with missing values via nuclear-norm regularization. The algorithm works like EM: it fills in the missing values with the current guess, then solves the optimization problem on the complete matrix using a soft-thresholded SVD. Special sparse-matrix classes are available for very large matrices.
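The EM-like iteration described above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's implementation: the name soft_impute_sketch and its return layout are hypothetical, it computes a full dense svd() at every step (so it ignores rank.max and the sparse-matrix tricks), and it thresholds the singular values at lambda, which assumes the squared-error term of the objective carries a factor of 1/2.

```r
# Illustrative sketch only; NOT the softImpute package's implementation.
# soft_impute_sketch is a hypothetical name introduced for this example.
soft_impute_sketch <- function(x, lambda, thresh = 1e-5, maxit = 100) {
  miss <- is.na(x)
  m_hat <- matrix(0, nrow(x), ncol(x))   # current low-rank estimate
  for (iter in seq_len(maxit)) {
    filled <- x
    filled[miss] <- m_hat[miss]          # fill the NAs with the current guess
    s <- svd(filled)                     # SVD of the completed matrix
    d <- pmax(s$d - lambda, 0)           # soft-threshold the singular values
    m_new <- s$u %*% (d * t(s$v))        # u diag(d) v'
    # relative change between successive estimates (squared Frobenius norm)
    delta <- sum((m_new - m_hat)^2) / max(sum(m_hat^2), 1e-9)
    m_hat <- m_new
    if (delta < thresh) break
  }
  list(u = s$u, d = d, v = s$v, iterations = iter)
}

# Usage on a small rank-3 matrix with 40 entries removed
set.seed(1)
x <- matrix(rnorm(20 * 3), 20, 3) %*% matrix(rnorm(3 * 10), 3, 10)
x[sample(length(x), 40)] <- NA
fit <- soft_impute_sketch(x, lambda = 1)
round(fit$d, 3)   # soft-thresholded singular values
```

Larger values of lambda push more singular values to exactly zero, which is how the regularization parameter controls the rank of the solution.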
softImpute(
x,
rank.max = 2,
lambda = 0,
type = c("als", "svd"),
thresh = 1e-05,
maxit = 100,
trace.it = FALSE,
warm.start = NULL,
final.svd = TRUE
)
x |
An m by n matrix with NAs. Large matrices can be of class
|
rank.max |
This restricts the rank of the solution. If sufficiently
large, and with |
lambda |
nuclear-norm regularization parameter. If |
type |
two algorithms are implemented, |
thresh |
convergence threshold, measured as the relative change in the Frobenius norm between two successive estimates. |
maxit |
maximum number of iterations. |
trace.it |
with |
warm.start |
an svd object can be supplied as a warm start. This is
particularly useful when constructing a path of solutions with decreasing
values of |
final.svd |
only applicable to |
softImpute solves the following problem for a matrix X with missing entries:
\min_M ||X-M||_o^2 + \lambda||M||_*.
Here ||\cdot||_o is the Frobenius norm, restricted to the non-missing entries of X, and ||M||_* is the nuclear norm of M (the sum of its singular values). Full details of the "svd" algorithm are given in Mazumder, Hastie and Tibshirani (2010); the "als" algorithm is described in Hastie, Mazumder, Lee and Zadeh (2015). Both methods employ special sparse-matrix tricks for large matrices with many missing values. This package creates a new sparse-matrix class "SparseplusLowRank" for matrices of the form x+ab', where x is sparse and a and b are tall skinny matrices, hence ab' is low rank. Methods for efficient left and right matrix multiplication are provided for this class. For large matrices, the function Incomplete() can be used to build the appropriate sparse input matrix from market-format data.
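The reason the x+ab' form allows efficient multiplication is that the low-rank term never has to be formed explicitly: (x+ab')v = xv + a(b'v). A minimal base R sketch of that identity follows. The helper names splr_right_mult and splr_left_mult are hypothetical, and an ordinary matrix with mostly zero entries stands in for a true sparse matrix; the package's own "SparseplusLowRank" methods are not used here.

```r
# Illustrative sketch only; helper names are hypothetical and this is not
# the package's "SparseplusLowRank" class.
set.seed(1)
n <- 6; p <- 5; r <- 2
x <- matrix(0, n, p)
x[sample(n * p, 8)] <- rnorm(8)    # mostly zeros, standing in for a sparse matrix
a <- matrix(rnorm(n * r), n, r)    # tall skinny factor (n by r)
b <- matrix(rnorm(p * r), p, r)    # tall skinny factor (p by r)
v <- rnorm(p)
u <- rnorm(n)

# Right multiplication: (x + a b') v = x v + a (b' v)
splr_right_mult <- function(x, a, b, v) {
  x %*% v + a %*% crossprod(b, v)
}

# Left multiplication: u' (x + a b') = u' x + (u' a) b'
splr_left_mult <- function(x, a, b, u) {
  crossprod(u, x) + crossprod(u, a) %*% t(b)
}

# Compare against the dense computation that forms a b' explicitly
dense <- x + a %*% t(b)
all.equal(dense %*% v, splr_right_mult(x, a, b, v))
all.equal(crossprod(u, dense), splr_left_mult(x, a, b, u))
```

With a truly sparse x, the cost of each product is driven by the number of non-zero entries of x plus the size of a and b, rather than by the full m by n size.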
An svd object is returned, with components "u", "d", and "v". If the solution has zeros in "d", the solution is truncated to rank one more than the number of nonzero singular values (so that a zero is visible). If the input matrix had been centered and scaled by biScale, the scaling details are assigned as attributes inherited from the input matrix.
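The three components combine as u diag(d) v' to give the low-rank estimate, whose entries at the missing positions are the imputed values. The sketch below shows this with base R only. The fit object is a stand-in built from svd(), the helper low_rank_estimate is a hypothetical name, and any centering or scaling from biScale is ignored; in practice the package's complete() and impute() functions handle these steps.

```r
# Illustrative sketch only; `fit` is a stand-in, not a softImpute result.
set.seed(1)
x <- matrix(rnorm(8 * 5), 8, 5)
s <- svd(x)
fit <- list(u = s$u[, 1:2], d = s$d[1:2], v = s$v[, 1:2])  # rank-2 svd object

# Hypothetical helper: form u diag(d) v' from the components.
# as.matrix() keeps this working when the solution has rank one.
low_rank_estimate <- function(fit) {
  u <- as.matrix(fit$u)
  v <- as.matrix(fit$v)
  u %*% (fit$d * t(v))
}

xna <- x
xna[c(3, 11, 27)] <- NA                  # knock out a few entries
est <- low_rank_estimate(fit)
xfill <- xna
xfill[is.na(xna)] <- est[is.na(xna)]     # replace only the missing entries
```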
Trevor Hastie, Rahul Mazumder
Maintainer: Trevor Hastie
hastie@stanford.edu
Rahul Mazumder, Trevor Hastie and Rob Tibshirani (2010) Spectral Regularization Algorithms for Learning Large Incomplete Matrices, Journal of Machine Learning Research, 11, 2287-2322, https://hastie.su.domains/Papers/mazumder10a.pdf
Trevor Hastie, Rahul Mazumder, Jason Lee and Reza Zadeh (2015) Matrix Completion and Low-rank SVD via Fast Alternating Least Squares, Journal of Machine Learning Research, 16, 3367-3402, https://arxiv.org/abs/1410.2596
biScale, svd.als, Incomplete, lambda0, impute, complete
library(softImpute)
set.seed(101)
n=200
p=100
J=50
np=n*p
missfrac=0.3
### signal of rank at most J plus noise
x=matrix(rnorm(n*J),n,J)%*%matrix(rnorm(J*p),J,p)+matrix(rnorm(np),n,p)/5
### set a fraction missfrac of the entries to NA
ix=seq(np)
imiss=sample(ix,np*missfrac,replace=FALSE)
xna=x
xna[imiss]=NA
### uses regular matrix method for matrices with NAs
fit1=softImpute(xna,rank.max=50,lambda=30)
### uses sparse matrix method for matrices of class "Incomplete"
xnaC=as(xna,"Incomplete")
fit2=softImpute(xnaC,rank.max=50,lambda=30)
### uses "svd" algorithm
fit3=softImpute(xnaC,rank.max=50,lambda=30,type="svd")
### fill in the missing entries of xna using fit1
ximp=complete(xna,fit1)
### first scale xna
xnas=biScale(xna)
fit4=softImpute(xnas,rank.max=50,lambda=10)
ximp=complete(xna,fit4)
### impute selected entries (row i, column j)
impute(fit4,i=c(1,3,7),j=c(2,5,10))
impute(fit4,i=c(1,3,7),j=c(2,5,10),unscale=FALSE) # ignore scaling and centering
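The warm.start argument is described above as useful when constructing a path of solutions. A sketch of such a path follows, with each fit used as the warm start for the next. It assumes the softImpute package is installed, that lambda0() accepts an "Incomplete" matrix, and that a warm start can be reused when rank.max grows between fits. The decreasing lambda grid, the rank increment of 2, and the names lamseq, fits and warm are illustrative choices rather than package defaults.

```r
library(softImpute)

# Same simulated data as in the examples above
set.seed(101)
n <- 200; p <- 100; J <- 50
x <- matrix(rnorm(n * J), n, J) %*% matrix(rnorm(J * p), J, p) +
  matrix(rnorm(n * p), n, p) / 5
xna <- x
xna[sample(n * p, 0.3 * n * p)] <- NA
xs <- as(xna, "Incomplete")

# lambda0() is assumed to return the smallest lambda giving the zero solution
lam0 <- lambda0(xs)
lamseq <- exp(seq(log(lam0), log(1), length.out = 10))  # decreasing grid

fits <- vector("list", length(lamseq))
warm <- NULL
rank.max <- 2
for (i in seq_along(lamseq)) {
  fits[[i]] <- softImpute(xs, rank.max = rank.max, lambda = lamseq[i],
                          warm.start = warm)
  warm <- fits[[i]]                            # reuse as the next warm start
  rank_i <- sum(round(fits[[i]]$d, 4) > 0)     # effective rank of this fit
  rank.max <- min(rank_i + 2, J)               # let the rank grow gradually
}
```

Letting rank.max grow only slightly beyond the rank of the previous solution keeps each fit cheap, since the early, heavily regularized solutions have very low rank.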