MGLMtune: Choose the tuning parameter value in sparse regression
MGLMtune

R Documentation

Description

Finds the tuning parameter value that yields the smallest BIC.

Usage

MGLMtune(
  formula,
  data,
  dist,
  penalty,
  lambdas,
  ngridpt,
  warm.start = TRUE,
  keep.path = FALSE,
  display = FALSE,
  init,
  weight,
  penidx,
  ridgedelta,
  maxiters = 150,
  epsilon = 1e-05,
  regBeta = FALSE,
  overdisp
)

Arguments

`formula`	an object of class `formula` (or one that can be coerced to that class): a symbolic description of the model to be fitted. The response must be on the left-hand side of `~`.
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data` when using function `MGLMtune`, the variables are taken from `environment(formula)`, typically the environment from which `MGLMtune` is called.
`dist`	a description of the distribution to fit. See `dist` for the details.
`penalty`	penalty type for the regularization term. Can be chosen from `"sweep"`, `"group"`, or `"nuclear"`. See `MGLMsparsereg` for the description of each penalty type.
`lambdas`	an optional vector of the penalty values to tune. If missing, the vector of penalty values will be set inside the function. `ngridpt` must be provided if `lambdas` is missing.
`ngridpt`	an optional numeric variable specifying the number of grid points to tune. If `lambdas` is given, `ngridpt` is ignored. Otherwise, the largest λ is determined from the data and the smallest λ is set to 1/n, where n is the sample size.
`warm.start`	an optional logical variable specifying whether to use a warm start at each tuning grid point. If `warm.start=TRUE`, the fitted sparse regression coefficients are used as the initial value when fitting the sparse regression at the next tuning grid point.
`keep.path`	an optional logical variable controlling whether to output the whole solution path. The default is `keep.path=FALSE`. If `keep.path=TRUE`, the sparse regression result at each grid point is kept and saved in the output object `select.list`.
`display`	an optional logical variable to specify whether to show each tuning step.
`init`	an optional matrix of initial values for the parameter estimates. Its dimensions should be compatible with the data. See `dist` for details of the dimensions in each distribution.
`weight`	an optional vector of weights assigned to each row of the data. Should be `NULL` or a numeric vector. Could be a variable from the `data`, or a variable from `environment(formula)` with the length equal to the number of rows of the data. If `weight=NULL`, equal weights of ones will be assigned.
`penidx`	a logical vector indicating the variables to be penalized. The default value is `rep(TRUE, p)`, which means all predictors are subject to regularization. If `X` contains an intercept, use `penidx=c(FALSE,rep(TRUE,p-1))`.
`ridgedelta`	an optional numeric controlling the behavior of Nesterov's accelerated proximal gradient method. The default value is 1/(pd).
`maxiters`	an optional numeric controlling the maximum number of iterations. The default value is `maxiters=150`.
`epsilon`	an optional numeric controlling the stopping criterion. The algorithm terminates when the relative change in the objective values of two successive iterates is less than `epsilon`. The default value is `epsilon=1e-5`.
`regBeta`	an optional logical variable used when running negative multinomial regression (`dist="NegMN"`). `regBeta` controls whether to run regression on the over-dispersion parameter. The default is `regBeta=FALSE`.
`overdisp`	an optional numeric variable used only when fitting a sparse negative multinomial model with `regBeta=FALSE`. `overdisp` gives the over-dispersion value for all the observations. The default value is estimated using negative multinomial regression. When `dist` is `"MN"`, `"DM"`, or `"GDM"`, or when `regBeta=TRUE`, the value of `overdisp` is ignored.
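
The grid and penalty-index arguments can be illustrated with a small base-R sketch that does not call `MGLMtune`. Here `lambda_max` is a made-up stand-in for the data-dependent largest λ, and the log spacing of the grid is an assumption for illustration; this page does not state how the internal grid is spaced.

```r
n <- 50          # sample size
p <- 10          # number of predictors; first column assumed to be an intercept
ngridpt <- 10    # number of grid points

lambda_max <- 20         # hypothetical value; MGLMtune derives this from the data
lambda_min <- 1 / n      # smallest lambda, as described for `ngridpt`

# Log-spaced grid from largest to smallest lambda (the spacing is an assumption)
lambdas <- exp(seq(log(lambda_max), log(lambda_min), length.out = ngridpt))

# Leave the intercept unpenalized and penalize the remaining p - 1 predictors
penidx <- c(FALSE, rep(TRUE, p - 1))
```

A vector such as `lambdas` could be supplied through the `lambdas` argument, in which case `ngridpt` is ignored.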

Value

`select`	the final sparse regression result, using the optimal tuning parameter.
`path`	a data frame with degrees of freedom and BICs at each lambda.
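
Selecting by smallest BIC can be sketched on a toy table. The column names and numbers below are invented for illustration and need not match the columns of `path`; the formula used is the standard BIC definition, which this page does not spell out.

```r
n <- 50  # sample size

# Toy solution path: one row per lambda (all values are made up)
toy_path <- data.frame(
  lambda = c(10, 3, 1, 0.3, 0.1),
  logL   = c(-520, -470, -455, -452, -451),  # log-likelihood at each fit
  dof    = c(0, 5, 15, 30, 45)               # degrees of freedom at each fit
)

# Standard BIC: -2 * log-likelihood + log(n) * degrees of freedom
toy_path$BIC <- -2 * toy_path$logL + log(n) * toy_path$dof

# The selected tuning parameter is the one with the smallest BIC
best <- toy_path[which.min(toy_path$BIC), ]
print(best)
```

Larger λ values give sparser fits with fewer degrees of freedom but lower likelihood; BIC trades the two off, so the minimum typically falls in the interior of the grid.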

Author(s)

Yiwen Zhang and Hua Zhou

Examples

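library(MGLM)

# Simulate Dirichlet-multinomial responses in which only predictors 1, 3, and 5 matter
set.seed(118)
n <- 50   # sample size
p <- 10   # number of predictors
d <- 5    # number of response categories
m <- rbinom(n, 100, 0.8)            # total count for each observation
X <- matrix(rnorm(n * p), n, p)
alpha <- matrix(0, p, d)
alpha[c(1, 3, 5), ] <- 1            # nonzero rows of the coefficient matrix
Alpha <- exp(X %*% alpha)
Y <- rdirmn(size=m, alpha=Alpha)

# Tune the sweep penalty over a 10-point grid and print the result
sweep <- MGLMtune(Y ~ 0 + X, dist="DM", penalty="sweep", ngridpt=10)
show(sweep)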

MGLM documentation built on April 14, 2022, 1:07 a.m.