cvSGL: Fit and cross-validate a GLM with a combination of lasso and...


Description

Fits and cross-validates a regularized generalized linear model via penalized maximum likelihood. The model is fit for a path of values of the penalty parameter, and a parameter value is chosen by cross-validation. Fits linear and logistic models.

Usage

cvSGL(data, index = NULL, weights = NULL, type = c("linear","logit"), alphas = seq(0,1,.1),
	nlam = 20, standardize = c("train","self","all","no"), nfold = 10, measure = c("ll","auc"),
	maxit = 1000, thresh = 0.001, min.frac = 0.05, gamma = 0.8, step = 1, reset = 10, ncores = 1,
	lambdas = NULL, verbose = FALSE)

Arguments

data

A list with components $x$, an input matrix of dimension $(n,p)$, and $y$, a response vector of length $n$. For type = "logit", $y$ should be binary.

index

A $p$-vector indicating group membership of each covariate

weights

Optional vector of weights for the group penalties

type

Model type: "linear" or "logit"

alphas

Vector of mixing parameters. alpha = 1 is the lasso penalty.

nlam

Number of lambda values in the regularization path

standardize

Type of standardization for full data and CV folds.

nfold

Number of folds of the cross-validation loop

measure

Performance measure used to select the best values of alphas and lambdas

maxit

Maximum number of iterations allowed for convergence

thresh

Convergence threshold for change in beta

min.frac

Minimum value of the penalty parameter, as a fraction of the maximum value

gamma

Fitting parameter used for tuning backtracking (between 0 and 1)

step

Fitting parameter used for initial backtracking step size (between 0 and 1)

reset

Fitting parameter used for taking advantage of local strong convexity in Nesterov momentum (number of iterations before momentum term is reset)

ncores

Number of computer cores to use in computations

lambdas

User-specified sequence of lambda values for fitting. We recommend leaving this NULL and letting cvSGL self-select values

verbose

Logical flag indicating whether the step number is printed during fitting
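To make the roles of alphas, weights, nlam and min.frac concrete, here is a minimal base-R sketch. It assumes the penalty has the sparse-group lasso form from Simon et al. and that the lambda path is log-spaced; neither is confirmed by this page for creNet. The helpers sgl_penalty and lambda_path are hypothetical names and are not part of the package.

```r
# Assumed penalty form (Simon et al.), not taken from the creNet source:
#   (1 - alpha) * lambda * sum_g w_g * ||beta_g||_2 + alpha * lambda * ||beta||_1
sgl_penalty <- function(beta, index, alpha, lambda,
                        weights = sqrt(as.vector(table(index)))) {
  # Euclidean norm of the coefficients within each group
  group_norms <- as.vector(tapply(beta, index, function(b) sqrt(sum(b^2))))
  (1 - alpha) * lambda * sum(weights * group_norms) +
    alpha * lambda * sum(abs(beta))
}

# Assumed path shape: nlam log-spaced values from lambda_max down to
# min.frac * lambda_max.
lambda_path <- function(lambda_max, nlam = 20, min.frac = 0.05) {
  exp(seq(log(lambda_max), log(min.frac * lambda_max), length.out = nlam))
}

index <- rep(1:2, each = 3)          # two groups of three covariates
beta  <- c(3, 4, 0, 0, 0, 0)         # only the first group is active
pen_lasso <- sgl_penalty(beta, index, alpha = 1, lambda = 0.5)  # lasso term only
pen_group <- sgl_penalty(beta, index, alpha = 0, lambda = 0.5)  # group term only
lams <- lambda_path(lambda_max = 2, nlam = 5, min.frac = 0.05)
```

Under these assumptions, alpha = 1 leaves only the elementwise L1 term, while alpha = 0 leaves only the weighted group norms, so intermediate values of alphas trade off within-group and between-group sparsity.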

Details

The function executes SGL nfold+1 times: the initial run finds the lambda sequence, and the subsequent runs compute the cross-validated error rate and its standard deviation. By default, the weights are the square roots of the group sizes.
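The default weights and the number of fits described above can be sketched in base R as follows. The fold assignment shown (balanced, randomly permuted labels) is an assumption for illustration; this page does not say how cvSGL splits the observations.

```r
set.seed(1)
n <- 50; p <- 100; nfold <- 10
index <- ceiling(seq_len(p) / 10)    # 10 groups of 10 covariates

# Default group weights as described above: square root of each group size
default_weights <- sqrt(as.vector(table(index)))

# Hypothetical fold assignment: balanced fold labels in random order
fold_id <- sample(rep(seq_len(nfold), length.out = n))

# One fit on the full data (to fix the lambda sequence) plus one fit per fold
n_fits <- nfold + 1
```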

Value

An object of class "cv.creNet" and "creNet" with components

fit

The fitted model using the best values of alphas and lambdas (class "creNet")

best.lambda

Index and value of the best element in lambdas

best.alpha

Index and value of the best element in alphas

lldiff

Cross-validation (negative) log likelihood for all alphas and lambdas (equal to the squared error loss if type = "linear")

llSD

Approximate standard deviations of lldiff

AUC

Area Under the Curve

lambdas

Values of lambda used in cross-validation.

alphas

User-specified argument alphas.
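As an illustration of how an index-and-value pair such as best.alpha or best.lambda can be read off a grid of cross-validation losses, consider the sketch below. The loss values are made up, and the layout (rows for alphas, columns for lambdas) is an assumption rather than the documented shape of lldiff.

```r
alphas  <- c(0, 0.5, 1)
lambdas <- c(1, 0.5, 0.1, 0.05)

# Made-up CV losses; rows = alphas, columns = lambdas (assumed layout)
cv_loss <- matrix(c(2.0, 1.5, 1.2, 1.3,
                    1.9, 1.1, 0.9, 1.0,
                    2.1, 1.4, 1.0, 1.2),
                  nrow = length(alphas), byrow = TRUE)

# Row and column position of the smallest loss
pos <- arrayInd(which.min(cv_loss), dim(cv_loss))
best_alpha  <- c(index = pos[1, 1], value = alphas[pos[1, 1]])
best_lambda <- c(index = pos[1, 2], value = lambdas[pos[1, 2]])
```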

Author(s)

Kourosh Zarringhalam and David Degras

Modified from SGL package: Noah Simon, Jerome Friedman, Trevor Hastie, and Rob Tibshirani

Maintainer: Kourosh Zarringhalam <kourosh.zarringhalam@umb.edu>

References

Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2011) A Sparse-Group Lasso,
http://web.stanford.edu/~hastie/Papers/SGLpaper.pdf

See Also

creSGL

Examples

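library(creNet)
set.seed(1)
n <- 50; p <- 100; size.groups <- 10
# Assign the covariates to 10 consecutive groups of 10
index <- ceiling(1:p / size.groups)
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
# Nonzero effects occur only among the first five covariates
beta <- -2:2
y <- X[, 1:5] %*% beta + 0.1 * rnorm(n)
data <- list(x = X, y = y)
# One penalty weight per group
weights <- rep(1, p / size.groups)
# Cross-validate over 100 lambda values at a single mixing value
cvFit <- cvSGL(data, index, weights, type = "linear", alphas = 0.05, nlam = 100,
	standardize = "train", nfold = 10, maxit = 1000, thresh = 0.001, min.frac = 0.05,
	gamma = 0.8, step = 1, reset = 10, lambdas = NULL, verbose = FALSE)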

kouroshz/creNet documentation built on May 20, 2019, 1:11 p.m.