glmPenaltyCV: Run cross-validation for the penalty parameter of the...
In PingYangChen/elnglm: Generalized Linear Model with Elastic Net Penalty


View source: R/elnglm.R

Run cross-validation for the penalty parameter of the generalized linear model.

glmPenaltyCV(
  y,
  x,
  family = c("gaussian", "binomial", "multinomial"),
  lambdaLength = 100,
  minLambdaRatio = 0.001,
  lambdaVec = NULL,
  alpha = 0.5,
  standardize = TRUE,
  maxit = 100,
  tol = 1e-04,
  nfolds = 3,
  ver = c("r", "arma")
)

`y`	the vector of the response variable.
`x`	the matrix of the predictors.
`family`	string. One of the response families, "gaussian", "binomial" or "multinomial".
`lambdaLength`	integer. The number of tuning penalty parameters. The default is 100.
`minLambdaRatio`	double. The ratio of the minimal value to the maximal value of the penalty parameter. The default is `1e-3`.
`lambdaVec`	vector. Optional user-supplied values of the tuning penalty parameter. The default is `NULL`, in which case the function automatically computes the maximal value of the penalty parameter and generates a sequence of penalty parameter values of length `lambdaLength`.
`alpha`	double. The elastic net parameter between 0 and 1. The default value is 0.5.
`standardize`	logical. If `TRUE`, the function first standardizes the predictor matrix. The default is `TRUE`.
`maxit`	integer. The number of maximal iterations of the coordinate descent algorithm. The default is 100.
`tol`	double. The value of the convergence tolerance of the coordinate descent algorithm. The default is `1e-4`.
`nfolds`	integer. The number of folds. The default value is 3.
`ver`	string. The version of the coordinate descent engine: "r" for the R implementation or "arma" for the C++ implementation using the Armadillo library.
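
The interplay of `lambdaLength` and `minLambdaRatio` can be illustrated with a minimal sketch. It assumes the penalty grid is log-spaced from the maximal value down to `minLambdaRatio` times that value, and that the maximal value follows the convention common to glmnet-style solvers for a Gaussian response; the package's internal computation may differ. `makeLambdaGrid` is a hypothetical helper, not a function of elnglm.

```r
# Hypothetical helper (not part of elnglm): log-spaced penalty grid that
# decreases from lambdaMax to lambdaMax * minLambdaRatio.
makeLambdaGrid <- function(lambdaMax, lambdaLength = 100, minLambdaRatio = 1e-3) {
  exp(seq(log(lambdaMax), log(lambdaMax * minLambdaRatio),
          length.out = lambdaLength))
}

# Toy data, for illustration only
set.seed(1)
x <- scale(matrix(rnorm(50 * 4), nrow = 50, ncol = 4))
y <- rnorm(50)
alpha <- 0.5

# Assumed convention: the largest penalty is max_j |x_j' (y - mean(y))| / (n * alpha)
lambdaMax <- max(abs(crossprod(x, y - mean(y)))) / (nrow(x) * alpha)

lambdaVec <- makeLambdaGrid(lambdaMax, lambdaLength = 5, minLambdaRatio = 1e-3)
print(lambdaVec)
```

A vector built this way could be passed through `lambdaVec` to bypass the automatic grid.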

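A list.

nfolds the number of folds used in the cross-validation.
foldid the vector of fold identifiers (presumably one fold label per observation).
lambdaBestId the index of the penalty parameter value that attains the best cross-validation score.
cvscore the vector of cross-validation scores, one per penalty parameter value.

The examples also use the components lambda, b0 and b of the returned list.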
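
How these components relate can be sketched as follows. This is an illustration under assumptions, not the package's code: fold labels are assumed to be balanced and randomly assigned, the score is assumed to be an error measure averaged over folds (smaller is better), and the per-fold scores are made-up numbers.

```r
set.seed(1)
n <- 20        # number of observations (toy value)
nfolds <- 3

# Balanced random fold labels in 1..nfolds, one per observation
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# Made-up per-fold errors (rows: folds, columns: penalty values)
foldScores <- matrix(c(3.0, 2.1, 2.6,
                       2.8, 1.9, 2.4,
                       3.2, 2.0, 2.5),
                     nrow = nfolds, byrow = TRUE)

# Average over folds, then pick the penalty index with the smallest error
cvscore <- colMeans(foldScores)
lambdaBestId <- which.min(cvscore)
print(lambdaBestId)
```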

# Generate data with a continuous response
set.seed(1)  # make the random coefficients below reproducible
trueb0 <- 1
# Only the first three predictors are active
trueact <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
trueb <- runif(10, -1, 1)*10
trueb[trueact == 0] <- 0
df <- glmDataGen(n = 500, d = 10, family = "gaussian", trueb0, trueb, s = 0.5, seed = 1)

# Run cross-validation
mdlcv <- glmPenaltyCV(y = df$y, x = df$x, family = "gaussian",
                      lambdaLength = 200, minLambdaRatio = 1e-3,
                      maxit = 1e5, tol = 1e-7, alpha = 0.5,
                      nfolds = 10, ver = "arma")
# Best lambda value
mdlcv$lambda[mdlcv$lambdaBestId]
# Cross-validation score along the penalty path
plot(log(mdlcv$lambda), mdlcv$cvscore, type = "l", xlab = "log(lambda)", ylab = "rmse")
# Estimated intercept of the best model
print(mdlcv$b0[mdlcv$lambdaBestId])
# Estimated coefficients of the best model
print(mdlcv$b[,mdlcv$lambdaBestId])