gren: Group-regularized logistic elastic net regression
In gren: Adaptive Group-Regularized Logistic Elastic Net Regression


Function that estimates a group-regularized elastic net model.

gren(x, y, m=rep(1, nrow(x)), unpenalized=NULL, partitions=NULL, alpha=0.5, 
     lambda=NULL, intercept=TRUE, monotone=NULL, psel=TRUE, compare=TRUE, 
     posterior=FALSE, nfolds=nrow(x), foldid=NULL, trace=TRUE,
     init=list(lambdag=NULL, mu=NULL, sigma=NULL, chi=NULL, ci=NULL),
     control=list(epsilon=0.001, maxit=500, maxit.opt=1000, maxit.vb=100))

`x`	feature data as either `numeric` `matrix` or `data.frame` of `numeric` variables.
`y`	response as either a `numeric` with binomial/binary successes of length `nrow(x)` or a `matrix` of `nrow(x)` rows and two columns, where the first column contains the binomial/binary failures and the second column the binomial/binary successes.
`m`	`numeric` of length `nrow(x)` that contains the number of Bernoulli trials.
`unpenalized`	Optional `numeric` `matrix` or `data.frame` of `numeric` unpenalized covariates of `nrow(x)` rows.
`partitions`	`list` that contains the (possibly multiple) partitions of the data. Every `list` object corresponds to one partition, where every partition is a `numeric` of length `ncol(x)` containing the group ids of the features.
`alpha`	proportion of L1 penalty as a `numeric` of length 1.
`lambda`	global penalty parameter. The default `NULL` will result in estimation by cross-validation.
`intercept`	`logical` to indicate whether an intercept should be included.
`monotone`	`list` of two `logical` vectors of length `length(partitions)`. The first vector, `monotone`, indicates whether the corresponding partition's penalty parameters should be estimated monotonically; the second vector, `decreasing`, indicates whether the monotone penalty parameters decrease with group number.
`psel`	either a `numeric` vector that indicates the number of features to select or a `logical`. If `TRUE`, feature selection is done by letting `glmnet` determine the penalty parameter sequence.
`compare`	`logical`, if `TRUE`, a regular non-group-regularized model is estimated.
`posterior`	if `TRUE`, the full variational Bayes posterior is returned.
`nfolds`	`numeric` of length 1 with the number of folds used in the cross-validation of the global `lambda`. The default is `nrow(x)`.
`foldid`	optional `numeric` vector of length `nrow(x)` with the fold assignments of the observations.
`trace`	if `TRUE`, progress of the algorithm is printed.
`init`	optional `list` containing the starting values of the iterative algorithm. See Details for more information.
`control`	a `list` of algorithm control parameters. See Details for more information.
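
The argument formats above can be put together as in the following sketch. It uses base R only. The object names (`successes`, `y_mat`, `folds`), the names and contents of the two partitions, and the 10-fold split are illustrative assumptions rather than package defaults, and the `gren` call is left commented out because it only indicates where each object would go.

```r
set.seed(1)
n <- 20   # observations
p <- 6    # features
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Binomial response: m trials per observation, stored as a two-column matrix
# with failures in the first column and successes in the second.
m <- rep(3, n)
successes <- rbinom(n, size = m, prob = 0.5)
y_mat <- cbind(failures = m - successes, successes = successes)

# Two partitions of the same p features; each is a vector of group ids.
partitions <- list(
  pathway  = rep(c(1, 2), each = p / 2),
  platform = rep(c(1, 2, 3), times = p / 3)
)

# One logical per partition: request monotone multipliers for the first
# partition only, decreasing with group number.
monotone <- list(monotone = c(TRUE, FALSE), decreasing = c(TRUE, FALSE))

# Fixed assignment of the observations to 10 folds (hypothetical choice).
folds <- sample(rep(seq_len(10), length.out = n))

# fit <- gren(x, y_mat, m = m, partitions = partitions,
#             monotone = monotone, foldid = folds)
```

Checking `lengths(partitions) == ncol(x)` before the call is a cheap way to catch a partition that does not cover every feature.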

This is the main function of the package; it estimates a group-regularized elastic net regression. The proportion of L1-norm penalisation in the elastic net penalty is determined by alpha: alpha close to 0 gives a more ridge-like penalty, while alpha close to 1 gives a more lasso-like penalty. The algorithm is a two-step procedure. First, a global penalty parameter lambda is estimated by cross-validation. Next, the groupwise lambda multipliers are estimated by an EM algorithm. The EM algorithm consists of: i) an expectation step in which the expected marginal likelihood of the penalty multipliers is iteratively approximated by a variational Bayes EM algorithm and ii) a maximisation step in which the approximate expected marginal likelihood is maximised with respect to the penalty multipliers. After the algorithm has converged, a frequentist elastic net model can optionally be fit with the estimated penalty multipliers, either by setting psel=TRUE or by setting psel to a numeric vector.
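
To make the role of alpha and of groupwise multipliers concrete, the sketch below evaluates an elastic net penalty in the parameterisation used by `glmnet`. The function `enet_penalty` and its argument `mult` are hypothetical names introduced for illustration. Applying one multiplier to both the L1 and L2 terms is an assumption of this sketch, and the way gren itself scales the two terms may differ.

```r
# Elastic net penalty in the glmnet parameterisation, with an optional
# per-feature multiplier (hypothetical argument `mult`).
enet_penalty <- function(beta, lambda, alpha, mult = rep(1, length(beta))) {
  l1 <- alpha * abs(beta)            # lasso part
  l2 <- (1 - alpha) / 2 * beta^2     # ridge part
  lambda * sum(mult * (l1 + l2))
}

beta <- c(-2, 0, 1)
lasso_pen <- enet_penalty(beta, lambda = 1, alpha = 1)  # equals sum(abs(beta))
ridge_pen <- enet_penalty(beta, lambda = 1, alpha = 0)  # equals sum(beta^2) / 2

# Penalise the features in group 2 twice as heavily as those in group 1.
groups <- c(1, 2, 2)
group_pen <- enet_penalty(beta, lambda = 1, alpha = 0.5, mult = c(1, 2)[groups])
```

Indexing a vector of group multipliers by the group ids, as in `c(1, 2)[groups]`, expands one value per group into one value per feature.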

The user may speed up the procedure by specifying initial values for the EM algorithm in init. init is a list that contains:

lambdag: initial values for λ_g in a list of length length(partitions).
mu: initial values for the μ_j in a numeric vector of length ncol(x) + ncol(unpenalized) + intercept.
chi: initial values for the χ_j in a numeric vector of length ncol(x).
ci: initial values for the c_i in a numeric vector of length nrow(x).
sigma: initial values for the Σ_{ij} in a numeric matrix with ncol(x) rows and columns.
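
A minimal sketch of an init list with the dimensions listed above. The starting values themselves (ones, zeros, an identity matrix) are placeholders chosen for illustration and are not recommended values. It is also an assumption that each element of lambdag holds one value per group of the corresponding partition.

```r
n <- 20   # number of observations, nrow(x)
p <- 6    # number of features, ncol(x)
partitions <- list(groups = rep(c(1, 2), each = p / 2))
n_unpenalized <- 0   # no unpenalized covariates in this sketch
intercept <- TRUE

init <- list(
  # one multiplier per group in each partition, started at 1
  lambdag = lapply(partitions, function(part) rep(1, length(unique(part)))),
  # length ncol(x) + ncol(unpenalized) + intercept
  mu      = rep(0, p + n_unpenalized + intercept),
  # ncol(x) rows and columns
  sigma   = diag(p),
  # length ncol(x)
  chi     = rep(1, p),
  # length nrow(x)
  ci      = rep(1, n)
)

# fit <- gren(x, y, partitions = partitions, init = init)
```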

control is a list with parameters to control the estimation procedure. It consists of the following components:

epsilon: numeric giving the relative convergence tolerance. Default is epsilon=0.001.
maxit: whole number giving the maximum number of iterations to update the lambdag. Default is maxit=500.
maxit.opt: whole number giving the maximum number of iterations used to numerically maximise over the lambdag. The maximisation is carried out at every iteration. Default is maxit.opt=1000.
maxit.vb: whole number giving the maximum number of iterations to update the variational parameters mu, sigma, chi, and ci. Each iteration performs one full update sequence. Default is maxit.vb=100.
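
One way to change a single setting while keeping the rest at the defaults shown in the usage is sketched below. It is not stated here whether gren fills in components that are missing from a partial control list, so this sketch passes all four. The tighter tolerance and the larger maxit.vb are arbitrary illustrative values.

```r
# Defaults copied from the function signature.
defaults <- list(epsilon = 0.001, maxit = 500, maxit.opt = 1000, maxit.vb = 100)

# Override two components; modifyList() keeps the others unchanged.
control <- modifyList(defaults, list(epsilon = 1e-4, maxit.vb = 200))

# fit <- gren(x, y, partitions = partitions, control = control)
```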

The function returns an S3 list object of class gren with the following components:

`call`	The function call that produced the output.
`alpha`	proportion of L1 penalty as a `numeric` of length 1.
`lambda`	global penalty parameter as `numeric`. Estimated by cross-validation if `lambda=NULL`.
`lambdag.seq`	`list` with full sequence of penalty multipliers over iterations.
`lambdag`	`list` with final estimates of penalty multipliers.
`vb.post`	`list` with variational posterior parameters mu_j, sigma_{ij}, c_i, and chi_j.
`freq.model`	frequentist elastic net model as output of `glmnet` call. `NULL` if `psel=FALSE`.
`iter`	`list` with number of iterations of `lambdag` estimation, with number of optimisation iterations of `lambdag`, and number of variational Bayes iterations.
`conv`	`list` of `logical`s with convergence of `lambdag` sequence, optimisation steps, and variational Bayes iterations.
`args`	`list` with input arguments of `gren` call.
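
The components above are ordinary list elements and can be read with `$`. The sketch below shows the access pattern on a stand-in list. `all_converged` is a hypothetical helper, and `mock_fit`, its numbers, and the element names inside `iter` and `conv` are made up for illustration; a real object would come from a call to `gren` and may name those inner elements differently.

```r
# Hypothetical helper: TRUE only if every convergence flag in `conv` is TRUE.
all_converged <- function(fit) all(unlist(fit$conv))

# Stand-in object with made-up values, used only to show the access pattern.
mock_fit <- list(
  lambdag = list(groups = c(1.8, 0.6)),
  iter    = list(lambdag = 12, opt = 40, vb = 25),
  conv    = list(lambdag = TRUE, opt = TRUE, vb = FALSE)
)

final_multipliers <- mock_fit$lambdag$groups   # final estimates per group
converged <- all_converged(mock_fit)           # FALSE: one flag is FALSE
```

If any flag is `FALSE`, increasing the matching limit in `control` is the natural next thing to try.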

Magnus M. Münch <m.munch@vumc.nl>

Münch, M.M., Peeters, C.F.W., van der Vaart, A.W., and van de Wiel, M.A. (2018). Adaptive group-regularized logistic elastic net regression. arXiv:1805.00389v1 [stat.ME].

predict.gren, coef.gren, cv.gren

## Create data
p <- 1000
n <- 100
set.seed(2018)
x <- matrix(rnorm(n*p), ncol=p, nrow=n)
beta <- c(rnorm(p/2, 0, 0.1), rnorm(p/2, 0, 1))
m <- rep(1, n)
y <- rbinom(n, m, as.numeric(1/(1 + exp(-x %*% as.matrix(beta)))))
partitions <- list(groups=rep(c(1, 2), each=p/2))

## estimate model
fit.gren <- gren(x, y, m, partitions=partitions)