gamlr: Gamma-Lasso regression

View source: R/gamlr.R

gamlr {gamlr}  R Documentation

Gamma-Lasso regression

Description

Adaptive L1 penalized regression estimation.

Usage

gamlr(x, y, 
   family=c("gaussian","binomial","poisson"),
   gamma=0,nlambda=100, lambda.start=Inf,  
   lambda.min.ratio=0.01, free=NULL, standardize=TRUE, 
   obsweight=NULL,varweight=NULL,
   prexx=(p<500),
   tol=1e-7,maxit=1e5,verb=FALSE, ...)

## S3 method for class 'gamlr'
plot(x, against=c("pen","dev"), 
    col=NULL, select=TRUE, df=TRUE, ...)
## S3 method for class 'gamlr'
coef(object, select=NULL, k=2, corrected=TRUE, ...)
## S3 method for class 'gamlr'
predict(object, newdata,
            type = c("link", "response"), ...)
## S3 method for class 'gamlr'
logLik(object, ...)

Arguments

x

A dense matrix or sparse Matrix of covariates, with ncol(x) variables and nrow(x)==length(y) observations. This should not include the intercept.

y

A vector of response values. There is almost no argument checking, so be careful to match y with the appropriate family.

family

Response model type; either "gaussian", "poisson", or "binomial". Note that for "binomial", y is in [0,1].

gamma

Penalty concavity tuning parameter; see details. Zero (default) yields the lasso, and higher values correspond to a more concave penalty.

nlambda

Number of regularization path segments.

lambda.start

Initial penalty value. Default of Inf implies the infimum lambda that returns all zero coefficients. This is the largest absolute coefficient gradient at the null model.

lambda.min.ratio

The smallest penalty weight (expected L1 cost) as a ratio of the path start value. Our default is always 0.01; note that this differs from glmnet whose default depends upon the dimension of x.

free

Free variables: indices of the columns of x which will be unpenalized.

standardize

Whether to standardize the coefficients to have standard deviation of one. This is equivalent to multiplying the L1 penalty by each coefficient standard deviation.

obsweight

For family="gaussian" only, weights on each observation in the weighted least squares objective. For other response families, obsweight is overwritten by IRLS weights. Defaults to rep(1,n).

varweight

Multipliers on the penalty associated with each covariate coefficient. Must be non-negative. These are further multiplied by sd(x_j) if standardize=TRUE. Defaults to rep(1,p).

prexx

Only possible for family="gaussian": whether to use pre-calculated weighted variable covariances in gradient calculations. This leads to massive speed-ups for big-n datasets, but can be slow for p>n datasets. See note.

tol

Optimization convergence tolerance relative to the null model deviance for each inner coordinate-descent loop. This is measured against the maximum coordinate change times deviance curvature after full parameter-set update.

maxit

Max iterations for a single segment coordinate descent routine.

verb

Whether to print some output for each path segment.

object

A gamlr object.

against

Whether to plot paths against log penalty or deviance.

select

In coef (and predict, which calls coef), the index of the path segments for which you want coefficients or predictions (e.g., use select=which.min(BIC(object)) for BIC selection). If NULL, the segment is selected via our AICc function with k as specified (see also corrected). If select=0, all segments are returned.

In plot, select is just a flag for whether to add lines marking AICc and BIC selected models.
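For example, a minimal sketch of the three selection modes described above (the data and fit here are illustrative, assuming the gamlr package is installed):

```r
library(gamlr)

set.seed(0)
x <- matrix(rnorm(100*5), nrow=100, ncol=5)
y <- x[,1] + rnorm(100)
fit <- gamlr(x, y)

b_aicc <- coef(fit)                              # default: AICc-selected segment
b_bic  <- coef(fit, select=which.min(BIC(fit)))  # explicit BIC selection
path   <- coef(fit, select=0)                    # coefficients for every segment
```

Each of b_aicc and b_bic is a single column (intercept plus p coefficients), while path has one column per path segment.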

k

If select=NULL in coef or predict, the AICc complexity penalty. k defaults to the usual 2.

corrected

A flag that swaps in the corrected (for high-dimensional bias) AICc for the standard AIC. You almost always want corrected=TRUE, unless you want to apply the BIC, in which case use k=log(n), corrected=FALSE.

newdata

New x data for prediction.

type

Either "link" for the linear equation, or "response" for predictions transformed to the same domain as y.

col

A single plot color, or vector of length ncol(x) colors for each coefficient regularization path. NULL uses the matplot default 1:6.

df

Whether to add degrees of freedom along the top axis of the plot.

...

Extra arguments to each method. Most importantly, from predict.gamlr these are arguments to coef.gamlr.

Details

Finds posterior modes along a regularization path of adapted L1 penalties via coordinate descent.

Each path segment t minimizes the objective -(\phi/n)logLHD(\beta_1 ... \beta_p) + \sum \omega_j\lambda|\beta_j|, where \phi is the exponential family dispersion parameter (\sigma^2 for family="gaussian", one otherwise). Weights \omega_j are set as 1/(1+\gamma|b_j^{t-1}|) where b_j^{t-1} is our estimate of \beta_j for the previous path segment (or zero if t=0). This adaptation is what makes the penalization ‘concave’; see Taddy (2013) for details.
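The adaptive weights are straightforward to compute; a minimal base-R sketch with illustrative values for gamma and the previous segment's estimates (these are not gamlr internals):

```r
gamma  <- 2                      # penalty concavity parameter
b_prev <- c(0, 1.5, -0.5)        # beta estimates from the previous path segment

# omega_j = 1/(1 + gamma*|b_j^{t-1}|): large coefficients get small penalties
omega <- 1/(1 + gamma*abs(b_prev))

# gamma = 0 recovers the lasso: every weight is exactly one
lasso_omega <- 1/(1 + 0*abs(b_prev))
```

So a coefficient estimated at zero keeps the full penalty (omega of one), while previously large coefficients are penalized less on the next segment.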

plot.gamlr can be used to graph the results: it shows the regularization paths for penalized \beta, with degrees of freedom along the top axis and minimum AICc selection marked.

logLik.gamlr returns log likelihood along the regularization path. It is based on the deviance, and is correct only up to static constants; e.g., for a Poisson it is off by \sum_i y_i(\log y_i-1) (the saturated log likelihood) and for a Gaussian it is off by likelihood constants (n/2)(1+\log2\pi).

Value

lambda

The path of fitted prior expected L1 penalties.

nobs

The number of observations.

alpha

Intercepts.

beta

Regression coefficients.

df

Approximate degrees of freedom.

deviance

Fitted deviance: (-2/\phi)( logLHD.fitted - logLHD.saturated ).

iter

Number of optimization iterations by segment, broken into coordinate descent cycles and IRLS re-weightings for family!="gaussian".

family

The exponential family model.

Note

Under prexx=TRUE (requires family="gaussian"), weighted covariances (VX)'X and (VX)'y, weighted column sums of VX, and column means \bar{x} will be pre-calculated. Here V is the diagonal matrix of least squares weights (obsweight, so V defaults to I). It is not necessary (they will be built by gamlr otherwise), but you have the option to pre-calculate these sufficient statistics yourself as arguments vxx (matrix or dspMatrix), vxy, vxsum, and xbar (all vectors) respectively. Search PREXX in gamlr.R to see the steps involved, and notice that there is very little argument checking – do so at your own risk. Note that xbar is an unweighted calculation, even if V \neq I. For really Big Data you can then run with x=NULL (e.g., if these statistics were calculated on distributed machines and the full design matrix is unavailable). Beware: in this x=NULL case our deviance (and df, if gamma>0) calculations are incorrect and selection rules will always return the smallest-lambda model.
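A minimal base-R sketch of pre-computing these statistics yourself (the argument names vxx, vxy, vxsum, and xbar match those documented above, but check the PREXX code in gamlr.R before relying on this):

```r
set.seed(1)
n <- 100; p <- 3
x <- matrix(rnorm(n*p), nrow=n, ncol=p)
y <- rnorm(n)
w <- rep(1, n)                   # obsweight: the diagonal of V (here V = I)

vx    <- x * w                   # VX: each row of x scaled by its weight
vxx   <- crossprod(vx, x)        # (VX)'X, a p x p matrix
vxy   <- drop(crossprod(vx, y))  # (VX)'y, a length-p vector
vxsum <- colSums(vx)             # weighted column sums of VX
xbar  <- colMeans(x)             # unweighted column means, even when V != I

## then, e.g., when the full design matrix is unavailable:
## fit <- gamlr(x=NULL, y=y, prexx=TRUE, vxx=vxx, vxy=vxy,
##              vxsum=vxsum, xbar=xbar)   # mind the caveats above
```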

Author(s)

Matt Taddy mataddy@gmail.com

References

Taddy (2017 JCGS), One-Step Estimator Paths for Concave Regularization, http://arxiv.org/abs/1308.5623

See Also

cv.gamlr, AICc, hockey

Examples


### a low-D test (highly multi-collinear)

n <- 1000
p <- 3
xvar <- matrix(0.9, nrow=p,ncol=p)
diag(xvar) <- 1
x <- matrix(rnorm(p*n), nrow=n)%*%chol(xvar)
y <- 4 + 3*x[,1] + -1*x[,2] + rnorm(n)

## run models out to an extra-small lambda: 1e-3 * lambda.start
fitlasso <- gamlr(x, y, gamma=0, lambda.min.ratio=1e-3) # lasso
fitgl <- gamlr(x, y, gamma=2, lambda.min.ratio=1e-3) # small gamma
fitglbv <- gamlr(x, y, gamma=10, lambda.min.ratio=1e-3) # big gamma

par(mfrow=c(1,3))
ylim <- range(fitglbv$beta@x)
plot(fitlasso, ylim=ylim, col="navy")
plot(fitgl, ylim=ylim, col="maroon")
plot(fitglbv, ylim=ylim, col="darkorange")


gamlr documentation built on April 17, 2023, 1:06 a.m.
