RegICRO: Variable selection for high-dimensional Regression with...


View source: R/RegICRO.R

Description

Applies the imputation conditional regularized optimization (ICRO) algorithm to high-dimensional variable selection in the presence of missing data.

Usage

RegICRO(x, y, coef, type = "indep", alpha1 = 0.1, alpha2 = 0.05,
        iteration = 30, warm = 20)

Arguments

x

An n × p matrix of covariates.

y

An n × 1 vector of responses.

coef

A p × 1 vector of coefficients for the linear regression model. The intercept coefficient defaults to 1.

type

The dependence structure of the covariates: "indep" for independent covariates or "dep" for dependent covariates. The default is "indep".

alpha1

The significance level of correlation screening in the ψ-learning algorithm; see the R package equSA for details. A higher significance level leads to a slightly larger separator set, which reduces the risk of missing important variables in the conditioning set. Including a few false variables in the conditioning set does little harm to the accuracy of the ψ-partial correlation coefficient. The default value is 0.1.

alpha2

The significance level of ψ-partial correlation coefficient screening for estimating the adjacency matrix; see equSA. The default value is 0.05.

iteration

The total number of iterations. The default value is 30.

warm

The number of burn-in iterations. The default value is 20.

Value

Var

The variables selected by the ICRO algorithm and their estimated coefficients.

table

A summary table for evaluating the performance of the ICRO algorithm. 'bias' denotes the Euclidean distance between the estimated and true coefficients, 'fsr' denotes the false selection rate, and 'nsr' denotes the negative selection rate. Smaller values indicate better performance.
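To make the three measures concrete, the following base R sketch computes them for a single fit. It is not taken from IROmiss. The helper name selection_metrics is hypothetical, and the definitions are assumptions: fsr is taken as the share of selected variables that are not truly active, and nsr as the share of truly active variables that were not selected. The package may compute them differently, for example by pooling over several datasets.

```r
# Hypothetical helper, not part of IROmiss. The fsr/nsr definitions below
# are assumptions made for illustration.
selection_metrics <- function(est, truth) {
  selected <- which(est != 0)    # variables chosen by the fitted model
  active   <- which(truth != 0)  # variables with truly nonzero coefficients
  false_sel <- length(setdiff(selected, active))  # selected but not active
  missed    <- length(setdiff(active, selected))  # active but not selected
  c(bias = sqrt(sum((est - truth)^2)),            # Euclidean distance
    fsr  = if (length(selected) > 0) false_sel / length(selected) else 0,
    nsr  = if (length(active) > 0) missed / length(active) else 0)
}

# Toy vectors: variable 3 is missed and variable 4 is falsely selected.
truth <- c(1, 2, -1.5, 0, 0, 0)
est   <- c(0.9, 2.1, 0, 0.3, 0, 0)
m <- selection_metrics(est, truth)
print(m)
```

In this toy case one of the three selected variables is false and one of the three active variables is missed, so both rates equal 1/3 under the assumed definitions.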

Author(s)

Bochao Jia <jbc409@ufl.edu> and Faming Liang

References

Liang, F., Song, Q. and Qiu, P. (2015). An Equivalent Measure of Partial Correlation Coefficients for High Dimensional Gaussian Graphical Models. J. Amer. Statist. Assoc., 110, 1248-1265.

Liang, F. and Zhang, J. (2008). Estimating FDR under general dependence using stochastic approximation. Biometrika, 95(4), 961-977.

Liang, F., Jia, B., Xue, J., Li, Q., and Luo, Y. (2018). An Imputation Regularized Optimization Algorithm for High-Dimensional Missing Data Problems and Beyond. Submitted to Journal of the Royal Statistical Society Series B.

Examples

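library(IROmiss)

# True coefficient vector: only the first five of p = 200 entries are nonzero
p <- 200
beta <- rep(0, p)
beta[1:5] <- c(1, 2, -1.5, -2.5, 5)

# Simulate n = 100 observations with independent covariates and 5% of values MAR
result <- SimRegDat(n = 100, p = p, coef = beta, data.type = "indep",
                    miss.type = "MAR", rate = 0.05)

# Run ICRO for 20 iterations, the first 10 of which are burn-in
RegICRO(result$x, result$y, result$coef, type = "indep", iteration = 20, warm = 10)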

IROmiss documentation built on March 26, 2020, 5:56 p.m.