library(grpreg)

grpreg fits models that fall into the penalized likelihood framework, in which we estimate $\bb$ by minimizing the objective function

$$ Q(\bb|\X, \y) = L(\bb|\X,\y) + P_\lam(\bb), $$

where $L(\bb|\X,\y)$ is the loss (deviance) and $P_\lam(\bb)$ is the penalty. This article describes the penalties available in grpreg; see the models article for more information on the available loss functions.

The following notation is used throughout (recall that the design matrix $\X$ is decomposed into groups $\X_1, \X_2, \ldots$):

Group selection

These penalties are sparse at the group level -- the coefficients within a group will either all equal zero or none will equal zero.

If you use any of these penalties, please cite

The article goes into more mathematical details, discusses issues of standardization in the group sense, and provides references.

The group lasso was originally proposed in

Group lasso

grpreg(X, y, group, penalty="grLasso")

$$ P(\bb) = \lam \sum_j \norm{\bb_j} $$
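
The formula above can be evaluated directly in base R. This is only a minimal sketch of the penalty as written here, not grpreg's internal code: the helper `group_lasso_penalty` and the toy `beta` and `group` vectors are hypothetical, and $\norm{\cdot}$ is assumed to be the Euclidean norm.

```r
# Hypothetical helper (not part of the grpreg API):
# evaluates lam * sum_j ||beta_j||, assuming ||.|| is the Euclidean norm.
group_lasso_penalty <- function(beta, group, lam) {
  # Euclidean norm of the coefficients in each group
  norms <- tapply(beta, group, function(b) sqrt(sum(b^2)))
  lam * sum(norms)
}

beta  <- c(3, 4, 0, 0, 1)   # toy coefficients
group <- c(1, 1, 2, 2, 3)   # group membership of each coefficient
group_lasso_penalty(beta, group, lam = 0.5)
```

Here the group norms are 5, 0, and 1, so the penalty is 0.5 * 6 = 3. A group contributes nothing to the penalty only when every coefficient in it is zero, which is what drives selection at the group level.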

Group MCP

grpreg(X, y, group, penalty="grMCP")

$$ P(\bb) = \sum_j \textrm{MCP}_{\lam, \gamma}(\norm{\bb_j}) $$

where $\textrm{MCP}_{\lam, \gamma}(\cdot)$ denotes the MCP penalty with regularization parameter $\lam$ and tuning parameter $\gamma$.
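
The article does not spell out the MCP function itself, so the sketch below assumes its standard definition: $\lam\theta - \theta^2/(2\gamma)$ for $\theta \le \gamma\lam$ and the constant $\gamma\lam^2/2$ beyond that. The helpers `mcp` and `group_mcp_penalty` are hypothetical names, and `gamma = 3` is just an illustrative value.

```r
# MCP for nonnegative theta (standard definition, assumed here; gamma > 1).
mcp <- function(theta, lam, gamma) {
  ifelse(theta <= gamma * lam,
         lam * theta - theta^2 / (2 * gamma),  # tapering region
         gamma * lam^2 / 2)                    # flat region: no further penalty
}

# Hypothetical helper: MCP applied to each group's Euclidean norm, then summed.
group_mcp_penalty <- function(beta, group, lam, gamma = 3) {
  norms <- tapply(beta, group, function(b) sqrt(sum(b^2)))
  sum(mcp(norms, lam, gamma))
}

beta  <- c(3, 4, 0, 0, 1)
group <- c(1, 1, 2, 2, 3)
group_mcp_penalty(beta, group, lam = 1, gamma = 3)
```

With these inputs the first group (norm 5) sits in the flat region and contributes 1.5, while the third group (norm 1) contributes 1 - 1/6. Unlike the group lasso, large groups are not penalized further once their norm exceeds $\gamma\lam$.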

Group SCAD

grpreg(X, y, group, penalty="grSCAD")

$$ P(\bb) = \sum_j \textrm{SCAD}_{\lam, \gamma}(\norm{\bb_j}) $$

where $\textrm{SCAD}_{\lam, \gamma}(\cdot)$ denotes the SCAD penalty with regularization parameter $\lam$ and tuning parameter $\gamma$.
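
As with MCP, the SCAD function is not written out in this article. The sketch below assumes the standard three-piece definition (linear up to $\lam$, quadratic up to $\gamma\lam$, constant afterwards); `scad` and `group_scad_penalty` are hypothetical helper names and `gamma = 4` is an illustrative value.

```r
# SCAD for nonnegative theta (standard definition, assumed here; gamma > 2).
scad <- function(theta, lam, gamma) {
  ifelse(theta <= lam,
         lam * theta,                                   # lasso-like region
         ifelse(theta <= gamma * lam,
                (2 * gamma * lam * theta - theta^2 - lam^2) /
                  (2 * (gamma - 1)),                    # quadratic transition
                lam^2 * (gamma + 1) / 2))               # flat region
}

# Hypothetical helper: SCAD applied to each group's Euclidean norm, then summed.
group_scad_penalty <- function(beta, group, lam, gamma = 4) {
  norms <- tapply(beta, group, function(b) sqrt(sum(b^2)))
  sum(scad(norms, lam, gamma))
}

beta  <- c(3, 4, 0, 0, 2)
group <- c(1, 1, 2, 2, 3)
group_scad_penalty(beta, group, lam = 1, gamma = 4)
```

The toy vector is chosen so that one group falls in the flat region (norm 5) and one in the quadratic transition (norm 2).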

Bi-level selection

These penalties are sparse at both the group and individual levels. In some groups, all coefficients will equal zero. However, even if a group is selected, some of the coefficients within that group may still be zero.

Group exponential lasso (GEL)

grpreg(X, y, group, penalty="gel")

$$ P(\bb) = \sum_j f_{\lam, \tau}(\norm{\bb_j}) $$

where $f(\cdot)$ denotes the exponential penalty with regularization parameter $\lam$ and tuning parameter $\tau$:

$$ f_{\lam, \tau}(\theta) = \frac{\lam^2}{\tau}\left\{1-\exp\left(-\frac{\tau\theta}{\lam}\right)\right\} $$
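
To get a feel for the exponential penalty, the formula above can be transcribed into a few lines of base R. The function name `exp_penalty` and the values of `lam` and `tau` are illustrative assumptions, not grpreg internals.

```r
# Exponential penalty f_{lam, tau}(theta), transcribed from the formula above.
# Hypothetical helper; theta is assumed nonnegative, lam and tau positive.
exp_penalty <- function(theta, lam, tau) {
  (lam^2 / tau) * (1 - exp(-tau * theta / lam))
}

theta <- c(0, 0.01, 1, 100)
vals  <- exp_penalty(theta, lam = 1, tau = 1/3)
vals

# Near zero the penalty grows at rate lam, like a lasso penalty
vals[2] / theta[2]
```

Two features are visible in the output: the slope near zero is approximately $\lam$, and the penalty levels off at $\lam^2/\tau$ (here 3) as $\theta$ grows, so large values are penalized little beyond that bound.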

If you use the GEL penalty, please cite

Composite MCP

grpreg(X, y, group, penalty="cMCP")

$$ P(\bb) = \sum_j \textrm{MCP}_{\lam, \gamma_1} \left( \sum_k \textrm{MCP}_{\lam, \gamma_2} (\abs{\beta_{jk}}) \right) $$

where $\textrm{MCP}_{\lam, \gamma}(\cdot)$ denotes the MCP penalty with regularization parameter $\lam$ and tuning parameter $\gamma$.
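
The nesting in the composite MCP formula is perhaps easiest to read as code: an inner MCP is applied to each coefficient, the results are summed within each group, and an outer MCP is applied to each group sum. The sketch below assumes the standard MCP definition; the helper names and the values chosen for the outer and inner tuning parameters (`gamma1`, `gamma2`) are hypothetical and are not claimed to match how grpreg sets them.

```r
# MCP for nonnegative theta (standard definition, assumed here).
mcp <- function(theta, lam, gamma) {
  ifelse(theta <= gamma * lam,
         lam * theta - theta^2 / (2 * gamma),
         gamma * lam^2 / 2)
}

# Hypothetical helper implementing the nested formula above.
composite_mcp_penalty <- function(beta, group, lam, gamma1, gamma2) {
  inner      <- mcp(abs(beta), lam, gamma2)   # inner MCP on each |beta_jk|
  group_sums <- tapply(inner, group, sum)     # sum over k within each group j
  sum(mcp(group_sums, lam, gamma1))           # outer MCP, summed over groups
}

beta  <- c(3, 4, 0, 0, 1)
group <- c(1, 1, 2, 2, 3)
composite_mcp_penalty(beta, group, lam = 1, gamma1 = 6, gamma2 = 3)
```

Because the inner penalty acts on individual coefficients, a coefficient can be zero even when the rest of its group is not, which is the bi-level behavior described at the start of this section.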

If you use the composite MCP penalty, please cite either of the following papers:

Please note that there is some confusion around the name "group MCP". In the first paper above (2009), the composite MCP penalty was referred to as the "group MCP" penalty; the second paper (2012), in reviewing the various kinds of group penalties that had been proposed, recommended changing the name to "composite MCP" to avoid confusion with the "group MCP" defined above.

Group bridge

gBridge(X, y, group)

$$ P(\bb) = \lam \sum_j K_j^\gamma \norm{\bb_j}_1^\gamma $$

where $K_j$ denotes the number of elements in group $j$.
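
As a small illustration of the group bridge formula above, the penalty can be computed in base R from the group sizes and the within-group $L_1$ norms. The helper `group_bridge_penalty` is a hypothetical name and `gamma = 0.5` is only an illustrative choice; this is not the algorithm `gBridge` uses for fitting.

```r
# Hypothetical helper: lam * sum_j K_j^gamma * ||beta_j||_1^gamma.
group_bridge_penalty <- function(beta, group, lam, gamma = 0.5) {
  K  <- tapply(beta, group, length)                   # group sizes K_j
  l1 <- tapply(beta, group, function(b) sum(abs(b)))  # L1 norm of each group
  lam * sum(K^gamma * l1^gamma)
}

beta  <- c(3, 4, 0, 0, 1)
group <- c(1, 1, 2, 2, 3)
group_bridge_penalty(beta, group, lam = 1, gamma = 0.5)
```

For this toy input the group sizes are 2, 2, 1 and the $L_1$ norms are 7, 0, 1, so with $\gamma = 0.5$ the penalty is $\sqrt{14} + 0 + 1$.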

Please note that the group bridge penalty uses a very different algorithm from the other penalties. Due to the nature of the penalty, model fitting is slower and less stable for group bridge models. This is, in fact, the main motivation for the GEL penalty described above: to offer a more tractable alternative to group bridge that has similar estimation properties but is much better behaved from a numerical optimization perspective.

If you use the group bridge penalty, please cite either of the following papers:

The first paper proposed the method; the second paper proposed the algorithm that is used in the grpreg package.

Specifying an additional ridge component

For all of the penalties in the previous sections, grpreg allows an additional ridge ($L_2$) component to be added to the penalty through the alpha argument. This sets $\lam_1 = \alpha\lam$ and $\lam_2=(1-\alpha)\lam$, with the penalty given by

$$ P(\bb) = P_1(\bb|\lam_1) + \frac{\lam_2}{2}\norm{\bb}^2, $$

where $P_1$ is any of the penalties from the earlier sections. So, for example,

grpreg(X, y, group, penalty="grLasso", alpha=0.75)

will fit a model with penalty

$$ P(\bb) = 0.75\lam \sum_j \norm{\bb_j} + \frac{0.25\lam}{2}\norm{\bb}^2. $$
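
The split of $\lam$ into $\lam_1$ and $\lam_2$ can be checked numerically. The sketch below evaluates the combined group lasso and ridge penalty for a toy coefficient vector; `ridge_group_lasso_penalty` is a hypothetical helper, and $\norm{\cdot}$ is assumed to be the Euclidean norm.

```r
# Hypothetical helper: group lasso penalty plus ridge component,
# with lam split as lam1 = alpha * lam and lam2 = (1 - alpha) * lam.
ridge_group_lasso_penalty <- function(beta, group, lam, alpha) {
  lam1  <- alpha * lam
  lam2  <- (1 - alpha) * lam
  norms <- tapply(beta, group, function(b) sqrt(sum(b^2)))
  lam1 * sum(norms) + (lam2 / 2) * sum(beta^2)
}

beta  <- c(3, 4, 0, 0, 1)
group <- c(1, 1, 2, 2, 3)
ridge_group_lasso_penalty(beta, group, lam = 1, alpha = 0.75)
```

With $\lam = 1$ and $\alpha = 0.75$ this gives 0.75 * 6 + 0.125 * 26 = 7.75. Setting `alpha = 1` removes the ridge term and recovers the plain group lasso penalty.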




grpreg documentation built on July 27, 2021, 1:08 a.m.