weighted.lasso: Compute weighted lasso variable selection


View source: R/main_functions.R

Description

Performs lasso variable selection in which each covariate is multiplied by a weight that indicates how likely the variable is to be associated with the response.

Usage

weighted.lasso(
  weights,
  weight_fn = function(x) { x },
  yy,
  XX,
  z,
  data.delta,
  z.names,
  show.plots = FALSE,
  penalty.choice,
  est.MSE = c("est.var", "step")[2],
  cv.folds = 10,
  delta = 2
)

Arguments

weights

(m by 1) matrix of weights, one per x-covariate. Each of the m covariates is multiplied by its weight.

weight_fn

A user-defined function to be applied to the weights for the weighted lasso. Default is the identity function.
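As a rough illustration of how weights and weight_fn interact, the sketch below transforms the weights and then scales each covariate column by its transformed weight. It is an assumption about the general idea, not the package's internal code; the data and the name XX_weighted are made up for the example.

```r
# Illustrative sketch only; not the internal code of weighted.lasso.
set.seed(1)
n <- 5
m <- 3
XX <- matrix(rnorm(n * m), nrow = n, ncol = m)  # n by m covariates
weights <- matrix(c(0.5, 1, 2), ncol = 1)       # m by 1 weights

# Identity function, matching the documented default.
weight_fn <- function(x) { x }

# Transform the weights, then multiply column j of XX by the j-th weight.
w <- as.vector(weight_fn(weights))
XX_weighted <- sweep(XX, 2, w, `*`)
```

A different weight_fn, for example function(x) { 1 - x }, would change only the vector w and leave the scaling step unchanged.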

yy

(n by 1) matrix containing the response variable. If regression.type is "cox", yy contains the observed event times.

XX

(n by K) matrix of main covariates where n is the sample size and K=m if z is NULL, and K= m+1 otherwise. Here, m refers to the number of x-covariates.

z

(n by 1) matrix of additional fixed covariate affecting response variable. This covariate will always be selected. Can be NULL.

data.delta

(n by 1) matrix of censoring indicators used when regression.type is "cox" (1 denotes that the survival event is observed, 0 denotes that it is censored). Can be NULL.
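The sketch below shows one common way to build a response and censoring indicator in the shapes described for yy and data.delta, using the 1 = observed, 0 = censored coding. The simulated times and the names event_time and censor_time are hypothetical and serve only as an example.

```r
# Illustrative sketch with simulated data; names other than yy and
# data.delta are hypothetical.
set.seed(4)
n <- 8
event_time <- rexp(n, rate = 0.5)   # latent time to the survival event
censor_time <- rexp(n, rate = 0.3)  # latent censoring time

# Observed time is whichever comes first.
yy <- matrix(pmin(event_time, censor_time), ncol = 1)

# 1 if the event was observed, 0 if the observation was censored.
data.delta <- matrix(as.numeric(event_time <= censor_time), ncol = 1)
```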

z.names

character denoting the column name of the z-covariate if z is not NULL. Can be NULL.

show.plots

logical indicator. If TRUE and penalty.choice is "penalized.loss", a plot of the penalized loss criterion versus steps in the LARS algorithm of Efron et al. (2004) is produced. Default is FALSE.

penalty.choice

character that indicates the variable selection criterion. Options are "cv.mse" for the K-fold cross-validated mean squared prediction error, "penalized.loss" for the penalized loss criterion, which requires specification of the penalization parameter delta, "cv.penalized.loss" for the K-fold cross-validated criterion to determine delta in the penalized loss criterion, and "deviance.criterion" for optimizing the Cox proportional hazards deviance (only available when regression.type is "cox"). Default is "penalized.loss".

est.MSE

character that indicates how the mean squared error is estimated in the penalized loss criterion when penalty.choice is "penalized.loss" or "cv.penalized.loss". Options are "est.var", which means the MSE is sd(y) * sqrt(n/(n-1)) where n is the sample size, and "step", which means the MSE is taken from forward stepwise regression with AIC as the selection criterion. Default is "step".
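To make the two est.MSE options concrete, the sketch below computes both quantities on simulated data with base R. The "est.var" line follows the formula stated above. The "step" part is only one plausible reading (forward selection with stats::step, then the residual mean square of the selected model) and may differ from what the package does internally; the data frame dat and the names est_var and mse_step are hypothetical.

```r
# Illustrative sketch on simulated data; not the package's internal code.
set.seed(2)
n <- 40
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 + rnorm(n)

# "est.var": the formula given in the argument description.
est_var <- sd(dat$y) * sqrt(n / (n - 1))

# "step" (assumed reading): forward stepwise selection by AIC,
# starting from the intercept-only model.
null_fit <- lm(y ~ 1, data = dat)
step_fit <- step(
  null_fit,
  scope = list(lower = ~ 1, upper = ~ x1 + x2 + x3),
  direction = "forward",
  trace = 0
)

# Residual mean square of the selected model.
mse_step <- sum(residuals(step_fit)^2) / df.residual(step_fit)
```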

cv.folds

scalar denoting the number of folds for cross-validation when penalty.choice is "cv.mse" or "cv.penalized.loss". Default is 10.
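For readers unfamiliar with K-fold cross-validation, the sketch below shows one standard way to split n observations into cv.folds groups of nearly equal size. It is a generic illustration, not the package's own fold assignment; fold_id and test_idx are hypothetical names.

```r
# Generic K-fold assignment; not taken from the package source.
set.seed(3)
n <- 25
cv.folds <- 10

# Assign each observation a fold label 1..cv.folds in random order.
fold_id <- sample(rep(seq_len(cv.folds), length.out = n))

# Held-out row indices for each fold; the remaining rows form the
# training set for that fold.
test_idx <- split(seq_len(n), fold_id)
```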

delta

scalar giving the penalization parameter delta in the penalized loss criterion when penalty.choice is "penalized.loss". Default is 2.

Value

References

Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32, 407–499.

Garcia, T.P. and Müller, S. (2016). Cox regression with exclusion frequency-based weights to identify neuroimaging markers relevant to Huntington’s disease onset. Annals of Applied Statistics, 10, 2130-2156.

Garcia, T.P. and Müller, S. (2014). Influence of measures of significance-based weights in the weighted Lasso. Journal of the Indian Society of Agricultural Statistics (Invited paper), 68, 131-144.

Garcia, T.P., Mueller, S., Carroll, R.J., Dunn, T.N., Thomas, A.P., Adams, S.H., Pillai, S.D., and Walzem, R.L. (2013). Structured variable selection with q-values. Biostatistics, DOI:10.1093/biostatistics/kxt012.

Storey, J.D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences, 100, 9440-9445.


rakheon/d2wlasso documentation built on Feb. 26, 2020, 10:39 p.m.