penZINB: Penalized zero-inflated negative binomial regression
In yliu433/scZINB: Graphical model estimation for zero-inflated count data


Perform variable selection for ZINB regression via penalized maximum likelihood.

penZINB(y, X, unpenalizedx = NULL, unpenalizedz = NULL, lambdas = NULL,
  taus = NULL, nlambda = 30, ntau = 5, naPercent = 0.4, maxIT = 1000,
  maxIT2 = 25, track = NULL, theta.st = NULL, stepThrough = NULL,
  optimType = "EM", loud = NULL, warmStart = FALSE, bicgamma = NULL,
  irlsConv = FALSE, weightedPen = TRUE, numericalDeriv = FALSE,
  pfactor = 0.01, oneTheta = FALSE, maxOptimIT = 50, eps = 1e-05,
  convType = 1, start = NULL, order = FALSE, penType = 1)
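
A minimal sketch of a call matching the signature above, assuming the `scZINB` package is installed and exports `penZINB`. The simulated data, the coefficient values and the small `nlambda`/`ntau` settings are illustrative choices, not taken from the package.

```r
# Simulate a small zero-inflated negative binomial data set (illustrative only).
set.seed(1)
n <- 200
p <- 5
X <- matrix(rnorm(n * p), n, p)

mu  <- exp(1 + 0.8 * X[, 1])        # NB mean depends on the first covariate
pi0 <- plogis(-1 + 0.8 * X[, 1])    # probability of a structural zero

structural_zero <- rbinom(n, 1, pi0)
y <- ifelse(structural_zero == 1, 0, rnbinom(n, size = 2, mu = mu))

# Fit only in an interactive session with the package available.
# A coarse grid (10 x 3) keeps the run short; the defaults are 30 x 5.
if (interactive() && requireNamespace("scZINB", quietly = TRUE)) {
  fit <- scZINB::penZINB(y, X, nlambda = 10, ntau = 3)
}
```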

`y`	zero-inflated count response
`X`	covariate matrix. An intercept is added within the function. Pass `1` to fit an intercept-only model.
`unpenalizedx, unpenalizedz`	Additional unpenalized covariates for negative binomial and logistic regression respectively. Default is `NULL`.
`lambdas, taus`	specific tuning parameter values to fit the model with. Default is `NULL`, in which case the function generates a tuning parameter search grid automatically; `nlambda` and `ntau` must then be supplied.
`nlambda, ntau`	number of unique lambda and tau values - defaults are 30 and 5.
`naPercent`	allowable percentage of observations with missing values - default is 0.4.
`maxIT`	maximum number of EM iterations - default is 1000.
`maxIT2`	maximum number of iterations for updating the coefficients in the regression model - default is 25.
`track`	default is `NULL` (deactivated). Otherwise, it takes a single integer value which activates tracking mode for that tuning parameter pair. See output change details below.
`theta.st`	default is `NULL` (deactivated) where theta estimation is done using MLE. Otherwise, takes a single value for theta to hold constant for all estimation.
`stepThrough`	default is `NULL` (deactivated). Otherwise, must be a length-2 vector, which activates debugging mode. The first number is the theta iteration and the second is the EM iteration at which the step-through debugger is started.
`optimType`	options are "EM" and "optim". Default is "EM" which runs the EM algorithm prior to using BFGS optimization. "optim" skips the EM algorithm.
`loud`	default is `NULL` (deactivated). Otherwise takes a positive integer x to announce at every xth iteration of EM and numerical optimization algorithm.
`warmStart`	default is FALSE, which uses the same starting point for all tuning parameter pairs. 'cond' resets the starting point to the original starting point when non-convergence happens. TRUE keeps the previous estimates as starting points for the next tuning parameter pair.
`bicgamma`	the parameter used in the extended BIC. Default is `NULL`, which uses log(dimension)/log(sample size).
`irlsConv`	forces each estimate of beta and gamma to converge first if set to TRUE. Default is FALSE.
`weightedPen`	default is TRUE. Weights the penalty using the Hessian.
`numericalDeriv`	default is FALSE. Calculates the Hessian numerically when set to TRUE. Otherwise, calculates it analytically.
`pfactor`	default is 1e-2. The multiplier applied to the largest calculated penalty to determine the smallest penalty value. Use in conjunction with nlambda/ntau to control the granularity of the tuning parameter grid.
`oneTheta`	default is FALSE (deactivated). If set to TRUE, only estimates theta once per tuning parameter pair.
`maxOptimIT`	maximum number of iterations for numerical optimization (BFGS) after the EM algorithm - default is 50. Convergence can take a long time.
`eps`	threshold for convergence for the EM algorithm - default is 1e-5.
`convType`	manages the order of convergence within the EM algorithm. Options are 1 (default) and 2. Type 1 forces convergence of the binomial and negative binomial parts together. Type 2 forces convergence of the binomial part first, then the negative binomial part.
`start`	default is `NULL`, which sets the starting coefficient values to 0. If set to 'jumpstart', the starting coefficients are estimated from penalized negative binomial estimation and logistic regression based on the penalized library. Starting values can also be supplied directly, in the form list(betas = v1, gammas = v2), where v1 and v2 are vectors whose length equals the number of covariates in X.
`order`	default is FALSE. If TRUE, the order of estimation follows the marginal correlation with the response.
`penType`	options are 1 (default) or 2. 1 is the group log penalty. 2 is lasso.
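
Three of the arguments above are easier to read as code. The sketch below shows how `pfactor` and `nlambda` could define a penalty grid, what the default `bicgamma` evaluates to, and the documented shape of `start`. It rests on assumptions: `lambda_max` is a hypothetical stand-in for the largest penalty the function calculates internally, the log-scale spacing is a common convention rather than something this page states, and "dimension" is taken here to mean the number of covariates.

```r
# Hypothetical stand-in for the largest calculated penalty; the real value
# is computed inside the function and depends on the data.
lambda_max <- 2.5
pfactor    <- 0.01
nlambda    <- 30

# Assumption: grid values are evenly spaced on the log scale between
# lambda_max and pfactor * lambda_max. The package may space them differently.
lambdas <- exp(seq(log(lambda_max), log(pfactor * lambda_max),
                   length.out = nlambda))

# Default bicgamma as documented: log(dimension) / log(sample size).
# Assumption: "dimension" is the number of covariates p.
n <- 200
p <- 50
bicgamma <- log(p) / log(n)

# Starting values in the documented form list(betas = v1, gammas = v2),
# with one entry per covariate in X.
start <- list(betas = rep(0, p), gammas = rep(0, p))
```

A smaller `pfactor` stretches the grid toward weaker penalties, while a larger `nlambda` makes it finer over the same range.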

If tracking, this function returns a nested list of all estimates with the following hierarchy:

Level 1 - Tuning parameter pair (may not be included if only one pair is tracked); last two elements are lambda and tau,
Level 2 - Theta Estimate (up to 20); last two elements are theta,
Level 3 - EM Iteration (up to maxIT),

with the following values: loglik, loglik.em, loglikZI, loglikNB, pen, betas, gammas.

Otherwise, this function returns a list with one element per tuning parameter pair. Each element contains the following components:

X: The design matrix used for calculations. Non-empty only for the first tuning parameter pair.
betas: Non-zero beta coefficients corresponding to betas.w.
gammas: Non-zero gamma coefficients corresponding to gammas.w.
loglik.obs: Observed data log likelihood at convergence.
pen: Value of the penalty at convergence.
theta.r: Theta path.
theta: Theta estimate at convergence.
BIC: BIC value at convergence.
extBIC: Extended BIC value at convergence.
extBICGG: Extended BIC GG value at convergence.
lambda, tau: Tuning parameter pair used.
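
A common next step is to pick one fit from the returned list. The sketch below does this on a mock object that only mimics the component names listed above; the numbers are made up, and it assumes the usual convention that a smaller extended BIC is better.

```r
# Mock of the documented return value: one list element per tuning parameter
# pair. Values are invented; only the component names come from this page.
fits <- list(
  list(lambda = 1.0, tau = 0.1, BIC = 512.3, extBIC = 530.1, betas = c(0.9)),
  list(lambda = 0.5, tau = 0.1, BIC = 498.7, extBIC = 521.4, betas = c(0.8, 0.2)),
  list(lambda = 0.1, tau = 0.1, BIC = 501.2, extBIC = 535.9, betas = c(0.8, 0.2, 0.1))
)

# Extract extBIC from every element and keep the fit with the smallest value.
ext_bic <- vapply(fits, function(f) f$extBIC, numeric(1))
best <- fits[[which.min(ext_bic)]]

# Tuning parameter pair of the selected fit.
c(lambda = best$lambda, tau = best$tau)
```

The same pattern works with `BIC` or `extBICGG` in place of `extBIC`.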