bmlasso: Bayesian Spike-and-Slab Lasso Models
In nyiuab/BhGLM: Bayesian hierarchical GLMs and survival models, with applications to Genomics and Epidemiology

Description

This function sets up Bayesian GLMs or Cox survival models with a spike-and-slab mixture double-exponential prior or a spike-and-slab mixture normal prior, and fits the model by incorporating EM steps into the fast coordinate descent algorithm.

Usage

    
bmlasso(x, y, family = c("gaussian", "binomial", "poisson", "cox"), 
        offset = NULL, epsilon = 1e-04, maxit = 50, init = NULL, 
        alpha = c(1, 0), ss = c(0.04, 0.5), b = 1, group = NULL, 
        Warning = FALSE, verbose = FALSE)

Arguments

`x`	input matrix, of dimension nobs x nvars; each row is an observation vector.
`y`	response variable. Quantitative for family="gaussian"; non-negative counts for family="poisson". For family="binomial", y should be either a factor with two levels, or a two-column matrix of counts or proportions (the second column is treated as the target class; for a factor, the last level in alphabetical order is the target class). For family="cox", y should be a two-column matrix with columns named 'time' and 'status'. The latter is a binary variable, with '1' indicating death and '0' indicating right censoring. The function Surv() in package survival produces such a matrix.
`family`	Response type (see above).
`offset`	A vector of length nobs that is included in the linear predictor.
`epsilon`	positive convergence tolerance e; the iterations converge when |dev - dev_old|/(|dev| + 0.1) < e.
`maxit`	integer giving the maximal number of EM iterations.
`init`	vector of initial values for all coefficients (not for the intercept). If not given, initial values are generated internally.
`alpha`	`alpha=1`: mixture double-exponential prior; `alpha=0`: mixture normal prior.
`ss`	a vector of two positive scale values (ss[1] < ss[2]) for the spike-and-slab mixture prior, leading to different shrinkage on different predictors and allowing for incorporation of group information.
`b`	group-specific inclusion probabilities follow beta(1,b). The tuning parameter `b` can be a vector of group-specific values.
`group`	a numeric vector, an integer, or a list defining the groups of predictors. If `group = NULL`, all the predictors form a single group. If `group = K`, the predictors are evenly divided into groups of `K` predictors each. If `group` is a numeric vector, it defines groups as follows: Group 1: `(group[1]+1):group[2]`, Group 2: `(group[2]+1):group[3]`, Group 3: `(group[3]+1):group[4]`, and so on. If `group` is a list of variable names, `group[[k]]` includes the variables in the k-th group. The mixture prior is only used for grouped predictors. For ungrouped predictors, the prior is double-exponential or normal with scale `ss[2]` and mean 0.
`Warning`	logical. If `TRUE`, show warning messages about non-convergence and identifiability.
`verbose`	logical. If `TRUE`, print the number of iterations and the computation time.
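
Two of the arguments above are easier to follow with concrete numbers: the numeric-vector form of `group` and the stopping rule controlled by `epsilon`. The base R sketch below is only an illustration. `group_ranges` and `converged` are hypothetical helpers written for this example, not functions exported by BhGLM, and they simply restate the rules given in the argument descriptions.

```r
# Hypothetical helper (not part of BhGLM): expand a numeric `group` vector of
# boundaries into a list of column-index ranges, using the documented rule
# Group k: (group[k] + 1):group[k + 1].
group_ranges <- function(group) {
  stopifnot(is.numeric(group), length(group) >= 2, all(diff(group) > 0))
  lapply(seq_len(length(group) - 1),
         function(k) (group[k] + 1):group[k + 1])
}

# Hypothetical helper: the relative-deviance stopping rule described for
# `epsilon`, |dev - dev_old| / (|dev| + 0.1) < epsilon.
converged <- function(dev, dev_old, epsilon = 1e-04) {
  abs(dev - dev_old) / (abs(dev) + 0.1) < epsilon
}

K <- 100
group <- seq(0, K, by = 5)    # boundaries 0, 5, 10, ..., 100
g <- group_ranges(group)
length(g)                     # 20 groups of 5 predictors each
g[[1]]                        # columns 1 to 5
g[[20]]                       # columns 96 to 100

converged(dev = 250.00001, dev_old = 250)   # TRUE: change is negligible
converged(dev = 249, dev_old = 250)         # FALSE: deviance still moving
```

The boundary vector always starts one below the first column of the first group, which is why it begins at 0 here.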

Details

This function sets up Bayesian GLMs and Cox models with a spike-and-slab mixture double-exponential or normal prior (Bayesian spike-and-slab mixture lasso or ridge). It fits the model by incorporating EM steps into the fast coordinate descent algorithm, using the function glmnet in the package glmnet for the coordinate descent updates.
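
To give a feel for what an EM step does under the mixture double-exponential prior, here is a minimal base R sketch of an E-step for one group of coefficients. It assumes the formulation in the spike-and-slab lasso GLM paper listed under References: each coefficient is in the slab (scale `ss[2]`) with probability `theta` and in the spike (scale `ss[1]`) otherwise. `ddexp` and `e_step` are hypothetical names introduced for this illustration; this is not the code used inside `bmlasso`.

```r
# Density of a double-exponential (Laplace) distribution with mean 0, scale s.
ddexp <- function(beta, s) exp(-abs(beta) / s) / (2 * s)

# Hypothetical helper (not part of BhGLM): one E-step for a group of
# coefficients, assuming inclusion probability `theta` for that group.
e_step <- function(beta, theta, ss = c(0.04, 0.5)) {
  slab  <- theta * ddexp(beta, ss[2])        # weight of the wide component
  spike <- (1 - theta) * ddexp(beta, ss[1])  # weight of the narrow component
  p <- slab / (slab + spike)                 # posterior inclusion probability
  inv_scale <- (1 - p) / ss[1] + p / ss[2]   # expected inverse prior scale
  list(p = p, scale = 1 / inv_scale)
}

beta <- c(0, 0.02, 0.5, -1.2)
fit <- e_step(beta, theta = 0.5)
round(fit$p, 3)      # near 0 for tiny coefficients, near 1 for large ones
round(fit$scale, 3)  # always between ss[1] and ss[2]
```

Coefficients close to zero get a scale near `ss[1]` and are shrunk hard, while large coefficients get a scale near `ss[2]` and are shrunk little. Under the assumption above, this is how the prior applies different shrinkage to different predictors.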

Value

This function returns all outputs from the function glmnet, and some other values used in Bayesian hierarchical models.

Author(s)

Nengjun Yi, nyi@uab.edu

References

Friedman, J., Hastie, T. and Tibshirani, R. (2010) Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software 33, 1-22.

Simon, N., Friedman, J., Hastie, T. & Tibshirani, R. (2011) Regularization Paths for Cox's Proportional Hazards Model via Coordinate Descent. Journal of Statistical Software 39, 1-13.

Zaixiang Tang, Yueping Shen, Xinyan Zhang, Nengjun Yi (2017) The Spike-and-Slab Lasso Generalized Linear Models for Prediction and Associated Genes Detection. Genetics 205, 77-88.

Zaixiang Tang, Yueping Shen, Xinyan Zhang, Nengjun Yi (2017) The Spike-and-Slab Lasso Cox Models for Survival Prediction and Associated Genes Detection. Bioinformatics, 33(18), 2799-2807.

Zaixiang Tang, Yueping Shen, Yan Li, Xinyan Zhang, Jia Wen, Chen'ao Qian, Wenzhuo Zhuang, Xinghua Shi, and Nengjun Yi (2018) Group Spike-and-Slab Lasso Generalized Linear Models for Disease Prediction and Associated Genes Detection by Incorporating Pathway Information. Bioinformatics 34(6): 901-910.

Zaixiang Tang, Yueping Shen, Shu-Feng Lei, Xinyan Zhang, Zixuan Yi, Boyi Guo, Jake Chen, and Nengjun Yi (2019) Gsslasso Cox: a fast and efficient pathway-based framework for predicting survival and detecting associated genes. BMC Bioinformatics 20(94).

Examples

library(BhGLM)
library(survival)
library(glmnet)


N = 1000
K = 100
x = sim.x(n=N, m=K, corr=0.6) # simulate correlated continuous variables  
h = rep(0.1, 4) # assign four non-zero main effects to have the assumed heritability 
nz = as.integer(seq(5, K, by=K/length(h))); nz
yy = sim.y(x=x[, nz], mu=0, herit=h, p.neg=0.5, sigma=1.6) # simulate responses
yy$coefs

# y = yy$y.normal; fam = "gaussian"; y = scale(y)
# y = yy$y.ordinal; fam = "binomial"
y = yy$y.surv; fam = "cox" 

group = NULL
#group = rep(0, 21)
#for(j in 1:length(group)) group[j] = (j-1) * K/(length(group)-1)

# lasso and mixture lasso

f1 = glmNet(x, y, family = fam, ncv = 1) 

ps = f1$prior.scale; ps 
ss = c(ps, 0.5)
f2 = bmlasso(x, y, family = fam, ss = ss, group = group)

par(mfrow = c(1, 2), mar = c(3, 4, 4, 4))
gap = 10
plot.bh(coefs = f1$coef, threshold = f1$df, gap = gap, main = "lasso") 
plot.bh(coefs = f2$coef, threshold = f2$df, gap = gap, main = "mixture lasso")

nyiuab/BhGLM documentation built on June 12, 2024, 9:28 p.m.