Fast k-nearest neighbors (K-NN) and principal component analysis (PCA) imputation algorithms for missing values in high-dimensional numeric matrices, i.e., epigenetic data. For extremely high-dimensional data with ordered features, a sliding window approach for K-NN or PCA imputation is provided. Additional features include group-wise imputation (e.g., by chromosome), hyperparameter tuning with repeated cross-validation, multi-core parallelization, and optional subset imputation. The K-NN algorithm is described in: Hastie, T., Tibshirani, R., Sherlock, G., Eisen, M., Brown, P. and Botstein, D. (1999) "Imputing Missing Data for Gene Expression Arrays". The PCA imputation is an optimized version of the imputePCA() function from the 'missMDA' package described in: Josse, J. and Husson, F. (2016) <doi:10.18637/jss.v070.i01> "missMDA: A Package for Handling Missing Values in Multivariate Data Analysis".
Package details |
|
|---|---|
| Author | Hung Pham [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-8271-9355>) |
| Maintainer | Hung Pham <amser.hoanghung@gmail.com> |
| License | GPL (>= 2) |
| Version | 1.0.0 |
| URL | https://github.com/hhp94/slideimp https://hhp94.github.io/slideimp/ |
| Package repository | View on CRAN |
| Installation |
Install the latest version of this package by entering the following in R:
|
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.