eimpute: Efficiently IMPUTE Large Scale Incomplete Matrix

knitr::opts_chunk$set(comment = "#>", warning = FALSE, eval = TRUE, message = FALSE, collapse = TRUE)
library(eimpute)

Introduction

Matrix completion is a procedure for imputing the missing elements in matrices by using the information of observed elements. This procedure can be visualized as:

Matrix completion has attracted a lot of attention and is widely applied in:

tabular data imputation: recover the missing elements in a data table;
recommender systems: estimate users' potential preferences for items they have not yet purchased;
image inpainting: fill in the missing elements of digital images.

eimpute is a computationally efficient R package for matrix completion. It solves the matrix completion problem by iterating between low-rank approximation and data calibration, which brings two advantages:

unbiased low-rank approximation for incomplete matrix
less time consumption via truncated SVD
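
The iteration described above can be sketched in base R. This is only an illustration of the general idea (alternate a rank-r approximation with restoring the observed entries), not the package's actual implementation: the helper name `impute_lowrank_sketch`, the zero initial fill, the stopping rule, and the use of a full `svd()` call instead of a truncated SVD routine are all assumptions made for this example.

```r
# Sketch only: base-R illustration of "low-rank approximation + data calibration".
# impute_lowrank_sketch is a hypothetical helper, not part of eimpute.
impute_lowrank_sketch <- function(x_na, r, max_it = 100, tol = 1e-6) {
  obs <- !is.na(x_na)                 # mask of observed entries
  z <- x_na
  z[!obs] <- 0                        # initial fill for missing entries (assumption)
  for (it in seq_len(max_it)) {
    # Low-rank approximation: keep only the leading r singular triplets.
    s <- svd(z, nu = r, nv = r)
    z_new <- s$u %*% (s$d[seq_len(r)] * t(s$v))
    # Data calibration: put the observed entries back exactly.
    z_new[obs] <- x_na[obs]
    # Stop when the relative change between iterates is small (assumption).
    change <- sqrt(sum((z_new - z)^2)) / max(sqrt(sum(z^2)), 1e-12)
    z <- z_new
    if (change < tol) break
  }
  z
}

# Usage on a small rank-2 matrix with six entries removed.
set.seed(1)
x <- matrix(rnorm(6 * 2), 6, 2) %*% matrix(rnorm(2 * 5), 2, 5)
x_na <- x
x_na[sample(length(x), 6)] <- NA
x_hat <- impute_lowrank_sketch(x_na, r = 2)
anyNA(x_hat)  # FALSE: every missing entry has been filled in
```

Because the calibration step runs last in every iteration, the observed entries of the result always match the input exactly; only the missing positions carry the low-rank estimate.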

We compare eimpute and softimpute on synthetic datasets $X_{m \times m}$ in which a proportion $p$ of the entries is missing. The square matrix $X_{m \times m}$ is generated as $X = UV + \epsilon$, where $U$ and $V$ are $m \times r$ and $r \times m$ matrices whose entries are sampled i.i.d. from the standard normal distribution, and $\epsilon \sim N(0, r/3)$.

$m$ is chosen from 1000, 2000, 3000, 4000;
$p$ is chosen from 0.1, 0.5, 0.9.
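
A minimal sketch of how such a synthetic dataset could be generated in base R is given here. The helper `simulate_incomplete` is hypothetical and is not the script used for the comparison. Two points are assumptions: $r/3$ is read as the noise variance (so the standard deviation is $\sqrt{r/3}$), and the missing positions are drawn uniformly at random. The rank $r = 5$ and the small size $m = 100$ are chosen only to keep the example fast.

```r
# Hypothetical helper (not part of eimpute or softimpute).
simulate_incomplete <- function(m, r, p) {
  u <- matrix(rnorm(m * r), m, r)     # U: m x r, standard normal entries
  v <- matrix(rnorm(r * m), r, m)     # V: r x m, standard normal entries
  # Assumption: r/3 is the noise variance, so sd = sqrt(r / 3).
  eps <- matrix(rnorm(m * m, sd = sqrt(r / 3)), m, m)
  x <- u %*% v + eps
  x_na <- x
  # Assumption: round(p * m^2) entries are removed uniformly at random.
  miss <- sample(m * m, round(p * m * m))
  x_na[miss] <- NA
  list(x = x, x_na = x_na)
}

set.seed(123)
sim <- simulate_incomplete(m = 100, r = 5, p = 0.5)
mean(is.na(sim$x_na))  # realized proportion of missing entries
```

Keeping the complete matrix `x` alongside `x_na` makes it possible to score an imputation on the entries that were removed.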

In the high-dimensional case, the als method in softimpute is slightly faster than eimpute when the proportion of missing observations is low. As that proportion increases, the rsvd method in eimpute outperforms softimpute in both time cost and test error. Comparing the two methods in eimpute, rsvd has a lower time cost than tsvd.
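
The text does not state how test error is defined. One common choice, assumed here purely for illustration, is the root mean squared error over the entries that were hidden from the imputation method. The helper `test_rmse` below is hypothetical.

```r
# Hypothetical helper: RMSE restricted to the entries that are NA in x_na.
test_rmse <- function(x_true, x_na, x_hat) {
  miss <- is.na(x_na)
  sqrt(mean((x_true[miss] - x_hat[miss])^2))
}

# Tiny worked example: two hidden entries, each imputed with an error of 1.
x_true <- matrix(1:6, 2, 3)
x_na <- x_true
x_na[1, 2] <- NA
x_na[2, 3] <- NA
x_hat <- x_true
x_hat[1, 2] <- x_true[1, 2] + 1
x_hat[2, 3] <- x_true[2, 3] - 1
test_rmse(x_true, x_na, x_hat)  # 1
```

Time cost can be measured in the same setting by wrapping the imputation call in `system.time()`.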

Installation

Install the stable version from CRAN:

install.packages("eimpute")

Install the development version from GitHub:

library(devtools)
install_github("Mamba413/eimpute", build_vignettes = TRUE)

Quick Example

We start with a toy example. Let us generate a small matrix with some missing values using the incomplete.generator function.

m <- 6
n <- 5
r <- 3
x_na <- incomplete.generator(m, n, r)
x_na

Use the eimpute function to impute the missing values.

x_impute <- eimpute(x_na, r)
x_impute[["x.imp"]]

eimpute documentation built on Sept. 11, 2024, 7:58 p.m.
