# EBglmnet-package: Empirical Bayesian Lasso (EBlasso) and Elastic Net (EBEN)...

In EBglmnet: Empirical Bayesian Lasso and Elastic Net Methods for Generalized Linear Models

## Description

Fast Empirical Bayesian Lasso (EBlasso) and Elastic Net (EBEN) are generalized linear regression methods for variable selection and effect estimation. Like `lasso` and `elastic net` as implemented in the package glmnet, EBglmnet handles `p >> n` data, where `p` is the number of variables and `n` is the number of samples in the regression model, and infers a sparse solution in which irrelevant variables have regression coefficients of exactly zero. In addition, EBglmnet has several unique features:

1) Both `EBlasso` and `EBEN` can select more than `n` nonzero effects.
2) EBglmnet also performs hypothesis testing for the significance of nonzero estimates.
3) EBglmnet includes built-in functions for epistasis analysis.

There are three sets of hierarchical prior distributions implemented in EBglmnet:

1) EBlasso-NE is a two-level prior with (normal + exponential) distributions for the regression coefficients.
2) EBlasso-NEG is a three-level hierarchical prior with (normal + exponential + gamma) distributions.
3) EBEN implements a normal and generalized gamma hierarchical prior.
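
As a reading aid, the first two hierarchies can be written out as below. This is a sketch rather than the package's exact parameterization: the symbols β_j (regression coefficient), σ_j² (its prior variance), λ, a, and b are assumed names, and the exponential distribution is assumed to be parameterized by its rate. The EBEN prior is omitted because this page does not give its form.

```latex
% EBlasso-NE: two-level prior (normal + exponential)
\begin{aligned}
\beta_j \mid \sigma_j^2 &\sim \mathcal{N}(0, \sigma_j^2), \\
\sigma_j^2 \mid \lambda &\sim \operatorname{Exp}(\lambda).
\end{aligned}

% EBlasso-NEG: the same two levels plus a gamma prior on \lambda
\begin{aligned}
\beta_j \mid \sigma_j^2 &\sim \mathcal{N}(0, \sigma_j^2), \\
\sigma_j^2 \mid \lambda &\sim \operatorname{Exp}(\lambda), \\
\lambda &\sim \operatorname{Gamma}(a, b).
\end{aligned}

% Under the rate parameterization assumed here, integrating out \sigma_j^2
% in the NE prior gives a Laplace (lasso) density for \beta_j:
p(\beta_j \mid \lambda) = \frac{\sqrt{2\lambda}}{2}
  \exp\!\left(-\sqrt{2\lambda}\,\lvert \beta_j \rvert\right)
```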

While all three sets of priors are "peak zero and flat tails", `EBlasso-NE` assigns more probability mass to the tails, so more of its nonzero estimates have large p-values. In contrast, `EBlasso-NEG` places a third-level constraint on the `lasso` prior, which puts more probability mass around zero and therefore gives sparser results. `EBEN`, meanwhile, encourages a grouping effect, so that highly correlated variables can be selected as a group. As with the relationship between `elastic net` and `lasso`, `EBEN` requires two parameters (α, λ), and it reduces to `EBlasso-NE` when α = 1. We recommend using EBlasso-NEG when there are a large number of candidate effects, EBlasso-NE when effect sizes are relatively small, and EBEN when groups of highly correlated variables, such as co-regulated gene expressions, are of interest.
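
The "peak zero and flat tails" shape can be checked numerically for a two-level normal + exponential prior. The sketch below is written in Python and does not call EBglmnet. The function names, the rate parameter `lam`, the thresholds 0.1 and 3.0, and the rate parameterization of the exponential are all illustrative assumptions. It compares draws from the hierarchy with draws from a plain normal distribution of the same variance.

```python
import math
import random


def sample_ne_prior(lam, n, rng):
    """Draw n coefficients from an assumed normal + exponential hierarchy."""
    draws = []
    for _ in range(n):
        variance = rng.expovariate(lam)  # sigma_j^2 ~ Exponential(rate=lam)
        draws.append(rng.gauss(0.0, math.sqrt(variance)))  # beta_j ~ N(0, sigma_j^2)
    return draws


def summarize(draws, near=0.1, far=3.0):
    """Return (second moment, share with |beta| < near, share with |beta| > far)."""
    n = len(draws)
    second_moment = sum(b * b for b in draws) / n  # equals the variance, since the mean is 0
    near_share = sum(1 for b in draws if abs(b) < near) / n
    far_share = sum(1 for b in draws if abs(b) > far) / n
    return second_moment, near_share, far_share


rng = random.Random(0)  # fixed seed so the comparison is repeatable
lam = 1.0               # hypothetical rate; the prior variance of beta_j is 1 / lam
n_draws = 100_000

ne_draws = sample_ne_prior(lam, n_draws, rng)
normal_draws = [rng.gauss(0.0, math.sqrt(1.0 / lam)) for _ in range(n_draws)]

ne_var, ne_near, ne_far = summarize(ne_draws)
normal_var, normal_near, normal_far = summarize(normal_draws)

print(f"NE prior : var={ne_var:.3f} near={ne_near:.4f} far={ne_far:.4f}")
print(f"normal   : var={normal_var:.3f} near={normal_near:.4f} far={normal_far:.4f}")
```

With matched variances, the hierarchical draws should show a larger share of values close to zero and a larger share of values far from zero than the normal draws, which is the behaviour that favours sparse solutions while still allowing large effects.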

Two models are available for both methods: the linear regression model and the logistic regression model. Other features in this package include:

1) Epistasis (two-way interactions) can be included for all models/priors.
2) Models are implemented in memory-efficient `C` code.
3) LAPACK/BLAS are used for most linear algebra computations.

Several simulations and real data analyses in the reference papers demonstrated that EBglmnet performs better than the `lasso` and `elastic net` methods in terms of power of detection and false discovery rate, and in encouraging a grouping effect when applicable.

Key algorithms are described in the following papers:
1. EBlasso-NEG: (Cai X., Huang A., and Xu S., 2011), (Huang A., Xu S., and Cai X., 2013)
2. EBlasso-NE: (Huang A., Xu S., and Cai X., 2013)
3. group EBlasso: (Huang A., Martin E., et al. 2014)
4. EBEN: (Huang A., Xu S., and Cai X., 2015)
5. Whole-genome QTL mapping: (Huang A., Xu S., and Cai X., 2014)

## Details

| Field | Value |
| --- | --- |
| Package | EBglmnet |
| Type | Package |
| Version | 4.1 |
| Date | 2016-01-15 |
| License | gpl |

## Author(s)

Anhui Huang, Dianting Liu
Maintainer: Anhui Huang <a.huang1@umiami.edu>

## References

Huang, A., Xu, S., and Cai, X. (2015). Empirical Bayesian elastic net for multiple quantitative trait locus mapping. Heredity 114(1): 107-115.

Huang, A., Martin, E., et al. (2014). Detecting genetic interactions in pathway-based genome-wide association studies. Genet Epidemiol 38(4): 300-309.

Huang, A., Xu, S., et al. (2014). Whole-genome quantitative trait locus mapping reveals major role of epistasis on yield of rice. PLoS ONE 9(1): e87330.

Huang, A. (2014). Sparse model learning for inferring genotype and phenotype associations. Ph.D. Dissertation. University of Miami (1186).

Huang, A., Xu, S., and Cai, X. (2013). Empirical Bayesian LASSO-logistic regression for multiple binary trait locus mapping. BMC Genetics 14(1): 5.

Cai, X., Huang, A., and Xu, S. (2011). Fast empirical Bayesian LASSO for multiple quantitative trait locus mapping. BMC Bioinformatics 12: 211.

EBglmnet documentation built on May 2, 2019, 2:46 a.m.