IROmiss-package: Imputation Regularized Optimization Algorithm

Description Details Author(s) References Examples

Description

Missing data are frequently encountered in high-dimensional data analysis, but they are usually difficult to deal with using standard algorithms, such as the EM algorithm and its variants. This package provides a general algorithm, the so-called imputation regularized optimization (IRO) algorithm, for treating high-dimensional missing data problems. A variant of the IRO algorithm, the so-called imputation conditional regularized optimization (ICRO) algorithm, has also been provided in the package.

Details

Package: IROmiss
Type: Package
Version: 1.0.2
Date: 2020-02-19
License: GPL-2

This package illustrates the use of the IRO/ICRO algorithms in three modules:

The first module is to apply the IRO algorithm to learning high-dimensional Gaussian Graphical Models (GGMs) in presence of missing data with a simulated dataset SimGraDat(n,p,...) and Yeast cell example YeastIRO(data,...).

The second module is to apply the ICRO algorithm to varisable selection for high-dimensional linear regression in presence of missing data. The simulation study covers both cases, the covariates are mutually independent and generally dependent, with the code SimRegDat(n,p,...). The real data example is for Bardet-Biedl syndrome (Scheetz et al., 2006) with the dataset available in the R package flare.

The third module is to apply the ICRO algorithm to random coefficient linear models, where the random coefficients are treated as missing data. We can generate a dataset for the random coefficient linear models with SimRCLM(I,J,...) and a simulated dataset data(RCDat) is included in the package, which can be used in RCLM(I,J,RCDat,...) for estimate the random coefficents.

Author(s)

Bochao Jia, Faming Liang Maintainer: Bochao Jia<jbc409@ufl.edu>

References

Liang, F., Song, Q. and Qiu, P. (2015). An Equivalent Measure of Partial Correlation Coefficients for High Dimensional Gaussian Graphical Models. J. Amer. Statist. Assoc., 110, 1248-1265.<doi:10.1080/01621459.2015.1012391>

Liang, F. and Zhang, J. (2008) Estimating FDR under general dependence using stochastic approximation. Biometrika, 95(4), 961-977.<doi:10.1093/biomet/asn036>

Liang, F., Jia, B., Xue, J., Li, Q., and Luo, Y. (2018). An Imputation Regularized Optimization Algorithm for High-Dimensional Missing Data Problems and Beyond. Submitted to Journal of the Royal Statistical Society Series B. <arXiv:1802.02251>

Jia, B., Xu, S., Xiao, G., Lamba, V., Liang, F. (2017) Inference of Genetic Networks from Next Generation Sequencing Data. Biometrics.

Examples

1
2
3
4
5
6
7
8
9
    
library(IROmiss)
p <- 200
beta <- rep(0,p)
beta[1:5] <- c(1, 2, -1.5, -2.5, 5)
result <- SimRegDat(n = 100, p = 200, coef = beta, data.type = "indep",
miss.type="MCAR", rate = 0.05)
RegICRO(result$x, result$y, result$coef, type = "indep", iteration = 30, warm = 20)
      

IROmiss documentation built on March 26, 2020, 5:56 p.m.