regnet-package: regnet: Network-Based Regularization for Generalized Linear...

regnet-package    R Documentation

regnet: Network-Based Regularization for Generalized Linear Models

Description

Network-based regularization has achieved success in variable selection for high-dimensional biological data due to its ability to incorporate correlations among genomic features. This package provides procedures for network-based variable selection in generalized linear models (Ren et al. (2017) doi:10.1186/s12863-017-0495-5 and Ren et al. (2019) doi:10.1002/gepi.22194). Continuous, binary, and survival responses are supported. Robust network-based methods are available for continuous and survival responses.

This package implements the network-based variable selection method in Ren et al. (2017) and the robust network-based method in Ren et al. (2019). In addition to the network penalty, regnet allows users to use the classical LASSO and MCP penalties.

Details

Two easy-to-use, integrated interfaces, cv.regnet() and regnet(), allow users to flexibly choose the method they want to use. Three arguments control the fitting method:

response: three types of response are supported: "binary", "continuous" and "survival".
penalty: three choices of penalty function are available: "network", "mcp" and "lasso".
robust: whether to use robust methods for modeling. Robust methods are available for survival and continuous responses.
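
The rules above can be sketched as a small base-R check. Note that check_fit_args() is a hypothetical helper written for illustration; it is not part of regnet, and the package's own behavior for unsupported combinations is not described here.

```r
# Hypothetical helper (not part of regnet): validates a combination of the
# three arguments against the rules stated in this section.
check_fit_args <- function(response, penalty, robust = FALSE) {
  response <- match.arg(response, c("binary", "continuous", "survival"))
  penalty  <- match.arg(penalty, c("network", "mcp", "lasso"))
  # Robust methods are documented only for continuous and survival responses.
  if (robust && response == "binary") {
    stop("robust methods are documented only for continuous and survival responses")
  }
  list(response = response, penalty = penalty, robust = robust)
}

# A documented combination passes through unchanged.
ok <- check_fit_args("survival", "network", robust = TRUE)

# An undocumented combination is rejected; keep the error message for inspection.
bad <- tryCatch(
  check_fit_args("binary", "lasso", robust = TRUE),
  error = function(e) conditionMessage(e)
)
```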

In penalized regression, the tuning parameter \lambda_{1} controls the sparsity of the coefficient profile. For network-based methods, an additional tuning parameter \lambda_{2} is needed to control the smoothness among coefficients. In typical usage, cv.regnet() computes the optimal values of the lambdas, which are then passed to regnet() to estimate the coefficients.
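
As a rough illustration of the roles of the two tuning parameters, the following is a sketch of a generic network-penalized objective, assuming a Laplacian-type quadratic network term. Every symbol other than \lambda_{1} and \lambda_{2} is an assumption introduced here: \ell_n is a log-likelihood (or the negative of a robust loss), \rho is a sparsity penalty such as MCP with regularization parameter \gamma, a_{jk} is an adjacency weight between features j and k, and s_{jk} is the sign of their correlation. The exact objectives used by regnet are given in the cited papers, not on this page.

```latex
\hat{\beta} = \arg\min_{\beta} \; -\frac{1}{n}\,\ell_n(\beta)
  + \sum_{j=1}^{p} \rho\bigl(|\beta_j|;\, \lambda_{1}, \gamma\bigr)
  + \frac{\lambda_{2}}{2} \sum_{1 \le j < k \le p} a_{jk}\,\bigl(\beta_j - s_{jk}\,\beta_k\bigr)^2
```

Under this form, a larger \lambda_{1} shrinks more coefficients to exactly zero, while a larger \lambda_{2} pulls the coefficients of connected features toward each other.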

To include clinical variables that are not subject to the penalty, use the argument 'clv' to indicate their positions in the X matrix. For example, 'clv=c(1:5)' means that the first five variables in X will not be penalized. It is recommended to place the clinical variables in contiguous columns at the beginning of the X matrix (see the 'Value' section of the regnet() function). However, non-contiguous indices, e.g. 'clv=c(2,4,6)', are also allowed.
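
The two ways of specifying 'clv' can be sketched in base R as follows. The matrix and its column names (age, sex, bmi, g1 to g4) are made up for illustration and do not come from the package's datasets.

```r
# Toy design matrix (hypothetical): 3 clinical and 4 genomic columns, interleaved.
set.seed(1)
X <- matrix(rnorm(10 * 7), nrow = 10,
            dimnames = list(NULL, c("g1", "age", "g2", "sex", "g3", "bmi", "g4")))
clinical <- c("age", "sex", "bmi")

# Option 1: keep the column order and pass non-contiguous positions.
clv_noncontig <- which(colnames(X) %in% clinical)

# Option 2 (the recommended layout): move the clinical columns to the front
# so that their positions are contiguous.
X_ordered  <- X[, c(clinical, setdiff(colnames(X), clinical))]
clv_contig <- seq_along(clinical)
```

Either index vector, paired with its matching matrix, can then be supplied as the 'clv' argument.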

References

Ren, J., Du, Y., Li, S., Ma, S., Jiang, Y. and Wu, C. (2019). Robust network-based regularization and variable selection for high dimensional genomics data in cancer prognosis. Genet. Epidemiol., 43:276-291 doi:10.1002/gepi.22194

Wu, C., Zhang, Q., Jiang, Y. and Ma, S. (2018). Robust network-based analysis of the associations between (epi)genetic measurements. J Multivar Anal., 168:119-130 doi:10.1016/j.jmva.2018.06.009

Wu, C., Jiang, Y., Ren, J., Cui, Y. and Ma, S. (2018). Dissecting gene-environment interactions: A penalized robust approach accounting for hierarchical structures. Statistics in Medicine, 37:437–456 doi:10.1002/sim.7518

Ren, J., He, T., Li, Y., Liu, S., Du, Y., Jiang, Y. and Wu, C. (2017). Network-based regularization for high dimensional SNP data in the case-control study of Type 2 diabetes. BMC Genetics, 18(1):44 doi:10.1186/s12863-017-0495-5

Wu, C. and Ma, S. (2015). A selective review of robust variable selection with applications in bioinformatics. Briefings in Bioinformatics, 16(5):873–883 doi:10.1093/bib/bbu046

Wu, C., Shi, X., Cui, Y. and Ma, S. (2015). A penalized robust semiparametric approach for gene-environment interactions. Statistics in Medicine, 34(30):4016–4030 doi:10.1002/sim.6609

See Also

Useful links:

cv.regnet, regnet

Examples


## Survival response using robust network method
data(SurvExample)
X = rgn.surv$X
Y = rgn.surv$Y
# Variables 1 to 5 are treated as clinical variables; we choose not to penalize them.
clv = c(1:5)
# Cross-validate to select the tuning parameters.
out = cv.regnet(X, Y, response="survival", penalty="network", clv=clv, robust=TRUE, verbo = TRUE)
out$lambda

# Refit using the lambdas in the first row of out$lambda.
fit = regnet(X, Y, "survival", "network", out$lambda[1,1], out$lambda[1,2], clv=clv, robust=TRUE)
# Compare the selected variables with the nonzero entries of rgn.surv$beta.
index = which(rgn.surv$beta[-(1:6)] != 0)  # [-(1:6)] removes the intercept and clinical variables
pos = which(fit$coeff[-(1:6)] != 0)
tp = length(intersect(index, pos))  # true positives
fp = length(pos) - tp               # false positives
list(tp=tp, fp=fp)



jrhub/regnet documentation built on Feb. 22, 2024, 2:56 p.m.