cv.glmmLasso: cv.glmmLasso


cv.glmmLasso R Documentation

cv.glmmLasso

Description

Performs k-fold cross-validation for glmmLasso models.

Usage

cv.glmmLasso(fix, rnd, data, family = stats::gaussian(link = "identity"),
  kfold = 5, lambdas = NULL, nlambdas = 100,
  lambda.min.ratio = ifelse(nobs < nvars, 0.01, 1e-04), loss,
  lambda.final = c("lambda.1se", "lambda.min"), ...)

Arguments

fix

A two-sided linear formula object describing the fixed-effects part of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. For categorical covariates, use as.factor(.) in the formula. Note that the corresponding dummy variables are treated as a group and are updated blockwise.

rnd

A two-sided linear formula object describing the random-effects part of the model, with the grouping factor on the left of a ~ operator and the random terms, separated by + operators, on the right; alternatively, the random-effects design matrix can be given directly (with suitable column names). If set to NULL, no random effects are included.

data

The data frame containing the variables named in fix and rnd.

family

A GLM family, see glm() and family(). Ordinal response models can also be fitted: use family=acat() for an adjacent category model and family=cumulative() for a cumulative model. If family is missing, a linear mixed model is fit; otherwise a generalized linear mixed model is fit.

kfold

Number of folds; default is 5. Although kfold can be as large as the sample size (leave-one-out CV), this is not recommended for large datasets. The smallest allowable value is kfold = 3.
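As an illustration of what splitting into kfold folds means, the following base R sketch assigns each observation to one of kfold roughly equal-sized folds at random. It is only an assumption about how a split could be made; this page does not state how cv.glmmLasso assigns folds internally, and the variable names are hypothetical.

```r
# Hypothetical sketch: random assignment of n observations to kfold folds.
# Not taken from the cv.glmmLasso source.
set.seed(1)
n <- 23
kfold <- 5

# Repeat the fold labels 1..kfold until n labels exist, then shuffle them
fold_id <- sample(rep(seq_len(kfold), length.out = n))

# Fold sizes differ by at most one observation
table(fold_id)
```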

lambdas

Optional user-supplied lambda sequence; default is NULL, and glmmLasso_MultLambdas chooses its own sequence

nlambdas

The number of lambda values to generate when lambdas is not user-supplied; default is 100.

lambda.min.ratio

Smallest value for lambda, as a fraction of lambda.max, the (data-derived) entry value (i.e. the smallest value for which all coefficients are zero). The default depends on the sample size nobs relative to the number of variables nvars. If nobs >= nvars, the default is 0.0001, close to zero. If nobs < nvars, the default is 0.01.
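To make the roles of lambda.max, lambda.min.ratio and nlambdas concrete, here is a minimal sketch of a lambda grid that runs from lambda.max down to lambda.max * lambda.min.ratio. The log spacing follows the glmnet convention and is an assumption here; make_lambda_grid is a hypothetical helper, not a function of this package, and the value of lambda_max is made up.

```r
# Hypothetical helper (not part of cv.glmmLasso): log-spaced lambda grid.
make_lambda_grid <- function(lambda_max, nlambdas = 100, nobs, nvars) {
  # Same default rule as in the Usage section
  ratio <- if (nobs < nvars) 0.01 else 1e-04
  # Decreasing sequence, equally spaced on the log scale
  exp(seq(log(lambda_max), log(lambda_max * ratio), length.out = nlambdas))
}

# Illustrative values only
grid <- make_lambda_grid(lambda_max = 50, nlambdas = 5, nobs = 200, nvars = 10)
grid
```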

loss

Loss function used to calculate the error; the default is based on family:

  • gaussian = cv.glmmLasso::calc_mse()

  • binomial = cv.glmmLasso::calc_logloss()

  • multinomial = cv.glmmLasso::calc_multilogloss()

  • poisson = cv.glmmLasso::calc_deviance()
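For orientation, the first two losses in this list can be sketched in base R as follows. These are the textbook definitions of mean squared error and binary log loss, written under hypothetical names; they are not the package's calc_mse() or calc_logloss(), whose signatures are not shown on this page.

```r
# Hypothetical stand-ins illustrating the loss definitions only.

# Mean squared error for a gaussian response
mse_sketch <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

# Binary log loss for a 0/1 response and predicted probabilities;
# probabilities are clipped away from 0 and 1 to avoid log(0)
logloss_sketch <- function(actual, predicted, eps = 1e-15) {
  p <- pmin(pmax(predicted, eps), 1 - eps)
  -mean(actual * log(p) + (1 - actual) * log(1 - p))
}

mse_sketch(c(1, 2, 3), c(1, 2, 5))
logloss_sketch(c(1, 0), c(0.5, 0.5))
```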

lambda.final

Which lambda to use for the final model, lambda.1se or lambda.min; default is lambda.1se.

...

Additional parameters passed on to glmmLasso.

Details

Builds multiple models over a sequence of lambda values.

Value

A list of cross-validation values including:

lambdas

The values of lambda used in the fits

cvm

The mean cross-validated error, a vector of length length(lambdas)

cvsd

Estimate of standard error of cvm.

cvup

Upper curve = cvm+cvsd.

cvlo

Lower curve = cvm-cvsd.
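The relationship between cvm, cvsd, cvup and cvlo can be sketched from a matrix of per-fold errors. The error values below are made up, and computing the standard error as the across-fold standard deviation divided by the square root of the number of folds is an assumption borrowed from the usual cross-validation convention, not a statement about this package's internals.

```r
# Made-up per-fold errors: rows are folds, columns are lambda values
fold_err <- matrix(c(1.0, 1.2, 0.8,
                     0.5, 0.7, 0.6,
                     0.9, 1.1, 1.0), nrow = 3)

# Mean error per lambda across folds
cvm <- colMeans(fold_err)

# Assumed standard error of that mean: sd across folds / sqrt(number of folds)
cvsd <- apply(fold_err, 2, sd) / sqrt(nrow(fold_err))

# Upper and lower curves as defined above
cvup <- cvm + cvsd
cvlo <- cvm - cvsd
```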

glmmLasso.final

A fitted glmmLasso object for the full data

lambda.min

Value of lambda that gives minimum cvm

lambda.1se

Largest value of lambda such that the error is within 1 standard error of the minimum
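The two selection rules can be illustrated with a few lines of base R. The vectors below are made-up values standing in for the lambdas, cvm and cvsd components of the returned list; the sketch follows the definitions given above and is not copied from the package source.

```r
# Made-up cross-validation results for four lambda values
lambdas <- c(1, 0.5, 0.1, 0.01)
cvm     <- c(2.0, 1.25, 1.0, 1.1)
cvsd    <- c(0.3, 0.3, 0.3, 0.3)

# lambda.min: the lambda with the smallest mean CV error
i_min <- which.min(cvm)
lambda_min <- lambdas[i_min]

# lambda.1se: the largest lambda whose error is within one
# standard error of the minimum error
threshold <- cvm[i_min] + cvsd[i_min]
lambda_1se <- max(lambdas[cvm <= threshold])
```

With these values lambda_1se is larger than lambda_min, so the 1se rule applies a stronger penalty than the minimum-error rule.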

Author(s)

Pirapong Jitngamplang, Jared Lander

Examples

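library(cv.glmmLasso)

# Load the soccer data shipped with glmmLasso
data("soccer", package = "glmmLasso")
# Center and scale the metric variables
soccer[, c(4, 5, 9:16)] <- scale(soccer[, c(4, 5, 9:16)], center = TRUE, scale = TRUE)
soccer <- data.frame(soccer)

# 5-fold cross-validation with a random intercept per team;
# the final model is refit at lambda.1se
mod1 <- cv.glmmLasso(fix = points ~ transfer.spendings + ave.unfair.score +
                       ball.possession + tackles,
                     rnd = list(team = ~1), data = soccer,
                     family = gaussian(link = "identity"),
                     kfold = 5, lambda.final = "lambda.1se")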

thepira/cv.glmmLasso documentation built on Dec. 11, 2022, 11:20 p.m.