This R package implements functions to perform variable selection with weighted lasso for both linear regression and the Cox proportional hazards regression. The weights are chosen to direct the variable selection procedure so that covariates that are highly associated with the response are likely to be selected and covariates that are weakly associated with the response are less likely to be selected. Association between the response and the covariates is based on results from simpler linear/Cox regressions between the response and each covariate, and include, for example, q-values, partial correlation coefficients, t-statistics of regression coefficients, and exclusion frequency statistics. The corresponding references are:
Garcia, T.P. and M¨uller, S. (2016). Cox regression with exclusion frequency-based weights to identify neuroimaging markers relevant to Huntington’s disease onset. Annals of Applied Statistics, 10, 2130-2156.
Garcia, T.P. and M¨uller, S. (2014). Influence of measures of significance-based weights in the weighted Lasso. Journal of the Indian Society of Agricultural Statistics (Invited paper), 68, 131-144.
Garcia, T.P., Mueller, S., Carroll, R.J., Dunn, T.N., Thomas, A.P., Adams, S.H., Pillai, S.D., and Walzem, R.L. (2013). Structured variable selection with q-values. Biostatistics, DOI:10.1093/biostatistics/kxt012.
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("rakheon/d2wlasso", force=TRUE)
Here are different examples of using this R package:
library(d2wlasso)
##################
## Linear Model ##
##################
x = matrix(rnorm(100*5, 0, 1),100,5)
z = matrix(rbinom(100, 1, 0.5),100,1)
y = matrix(z[,1] + 2*x[,1] - 2*x[,2] + rnorm(100, 0, 1), 100)
dwl0 = d2wlasso(x,z,y)
dwl1 = d2wlasso(x,z=NULL,y,weight.type="corr.pvalue")
dwl2 = d2wlasso(x,z,y,weight.type="parcor.qvalue")
dwl3 = d2wlasso(x,z,y,weight.type="parcor.bh.pvalue")
dwl4 = d2wlasso(x,z,y,weight.type="parcor.qvalue",mult.cv.folds=100)
dwl5 = d2wlasso(x,z,y,weight.type="exfrequency.random.partition.aic")
dwl6 = d2wlasso(x,z,y,weight.type="exfrequency.kmeans.partition.aic")
dwl7 = d2wlasso(x,z,y,weight.type="exfrequency.kquartiles.partition.aic")
dwl8 = d2wlasso(x,z,y,weight.type="exfrequency.ksorted.partition.aic")
###############
## Cox model ##
###############
x = matrix(rnorm(100*5, 0, 1),100,5)
z = matrix(rbinom(100, 1, 0.5),100,1)
y = matrix(exp(z[,1] + 2*x[,1] - 2*x[,2] + rnorm(100, 0, 2)), 100)
cox.delta = matrix(1,nrow=length(y),ncol=1)
dwl0.cox = d2wlasso(x,z,y,cox.delta,regression.type="cox",penalty.choice="cv.mse")
dwl1.cox = d2wlasso(x,z=NULL,y,cox.delta,
regression.type="cox",weight.type="corr.pvalue",penalty.choice="cv.mse")
dwl2.cox = d2wlasso(x,z,y,cox.delta,
regression.type="cox",weight.type="parcor.qvalue",penalty.choice="cv.mse")
dwl3.cox = d2wlasso(x,z,y,cox.delta,
regression.type="cox",weight.type="parcor.bh.pvalue",penalty.choice="cv.mse")
dwl4.cox = d2wlasso(x,z,y,cox.delta,
regression.type="cox",weight.type="parcor.qvalue",
mult.cv.folds=100,penalty.choice="cv.mse")
dwl5.cox = d2wlasso(x,z,y,cox.delta,regression.type="cox",
weight.type="exfrequency.random.partition.aic",penalty.choice="cv.mse")
dwl6.cox = d2wlasso(x,z,y,cox.delta,regression.type="cox",
weight.type="exfrequency.kmeans.partition.aic",penalty.choice="cv.mse")
dwl7.cox = d2wlasso(x,z,y,cox.delta,regression.type="cox",
weight.type="exfrequency.kquartiles.partition.aic",penalty.choice="cv.mse")
dwl8.cox = d2wlasso(x,z,y,cox.delta,regression.type="cox",
weight.type="exfrequency.ksorted.partition.aic",penalty.choice="cv.mse")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.