In mpath: Regularized Linear Models

zipath

R Documentation

Fit zero-inflated count data linear model with lasso (or elastic net), snet or mnet regularization

Description

Fit zero-inflated regression models for count data via penalized maximum likelihood.
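To make the objective concrete, the following base-R sketch writes out a lasso-penalized negative log-likelihood for a zero-inflated Poisson model, assuming a log link for the count mean and a logit link for the zero-inflation probability. The helper `zip_penalized_nll`, its argument names and the 1/n scaling of the likelihood are illustrative assumptions, not mpath internals; `zipath` itself may scale the objective differently, also supports other penalties and families, and is fitted with the algorithms described in the References.

```r
# Hypothetical helper (not part of mpath): lasso-penalized negative
# log-likelihood of a zero-inflated Poisson model.
# beta, gamma: coefficient vectors whose first element is the intercept.
zip_penalized_nll <- function(beta, gamma, X, Z, y,
                              lambda_count = 0, lambda_zero = 0) {
  mu <- exp(drop(cbind(1, X) %*% beta))      # count mean (log link)
  p0 <- plogis(drop(cbind(1, Z) %*% gamma))  # zero-inflation probability (logit link)
  # A zero can be a structural zero or a Poisson zero
  lik <- ifelse(y == 0,
                p0 + (1 - p0) * dpois(0, mu),
                (1 - p0) * dpois(y, mu))
  # Intercepts are left unpenalized (an assumption of this sketch)
  penalty <- lambda_count * sum(abs(beta[-1])) +
             lambda_zero  * sum(abs(gamma[-1]))
  -mean(log(lik)) + penalty
}

# Simulated data, for illustration only
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 2), n, 2)
Z <- X
y <- ifelse(runif(n) < 0.3, 0, rpois(n, exp(0.5 + 0.4 * X[, 1])))

beta  <- c(0.5, 0.4, 0)
gamma <- c(qlogis(0.3), 0, 0)
nll0 <- zip_penalized_nll(beta, gamma, X, Z, y)
nll1 <- zip_penalized_nll(beta, gamma, X, Z, y, lambda_count = 0.1)
```

With these coefficients the two values differ only by the count penalty, `0.1 * (abs(0.4) + abs(0))`.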

Usage

## S3 method for class 'formula'
zipath(formula, data, weights, offset=NULL, contrasts=NULL, ... )
## S3 method for class 'matrix'
zipath(X, Z, Y, weights, offsetx=NULL, offsetz=NULL, ...)
## Default S3 method:
zipath(X, ...)

Arguments

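`formula`	symbolic description of the model, of the form `y ~ x | z`, where `x` gives the regressors of the count model and `z` the regressors of the zero model; see Examples.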
`data`	argument controlling formula processing via `model.frame`.
`weights`	optional numeric vector of weights.
`offset`	optional numeric vector with an a priori known component to be included in the linear predictor of the count model or zero model. See below for an example.
`contrasts`	a list with elements `"count"` and `"zero"` containing the contrasts corresponding to `levels` from the respective models
`X`	predictor matrix of the count model
`Z`	predictor matrix of the zero model
`Y`	response variable
`offsetx`, `offsetz`	optional numeric vectors with an a priori known component to be included in the linear predictor of the count model (`offsetx`) or the zero model (`offsetz`).
`...`	Other arguments which can be passed to `glmreg` or `glmregNB`
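As a rough illustration of what an offset does under a log link, the sketch below uses an ordinary Poisson `glm` from base R as a stand-in for the count part of the model; it is not a `zipath` call, and the simulated data and variable names are assumptions made for the example. With a constant exposure, adding `offset(log(exposure))` leaves the slope unchanged and shifts only the intercept by `-log(exposure)`.

```r
# Simulated data; all names here are illustrative.
set.seed(42)
n <- 500
x <- rnorm(n)
y <- rpois(n, exp(0.3 + 0.5 * x))
exposure <- rep(0.5, n)

fit_plain  <- glm(y ~ x, family = poisson)
fit_offset <- glm(y ~ x + offset(log(exposure)), family = poisson)

# Difference in coefficients: intercept moves by -log(exposure), slope by 0
shift <- coef(fit_offset) - coef(fit_plain)
```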

Value

An object of class "zipath", i.e., a list with components including

`coefficients`	a list with elements `"count"` and `"zero"` containing the coefficients from the respective models,
`residuals`	a vector of raw residuals (observed - fitted),
`fitted.values`	a vector of fitted means,
`weights`	the case weights used,
`terms`	a list with elements `"count"`, `"zero"` and `"full"` containing the terms objects for the respective models,
`theta`	estimate of the additional θ parameter of the negative binomial model (if a negative binomial regression is used),
`loglik`	log-likelihood of the fitted model,
`family`	character string describing the count distribution used,
`link`	character string describing the link of the zero-inflation model,
`linkinv`	the inverse link function corresponding to `link`,
`converged`	logical value: `TRUE` if `zipath` converged successfully, `FALSE` otherwise,
`call`	the original function call,
`formula`	the original formula,
`levels`	levels of the categorical regressors,
`contrasts`	a list with elements `"count"` and `"zero"` containing the contrasts corresponding to `levels` from the respective models,
`model`	the full model frame (if `model = TRUE`),
`y`	the response count vector (if `y = TRUE`),
`x`	a list with elements `"count"` and `"zero"` containing the model matrices from the respective models (if `x = TRUE`).
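The relationship between `linkinv`, `fitted.values` and `residuals` can be sketched in base R. The sketch assumes a logit link and the usual zero-inflated convention that the fitted mean is (1 - p) * mu, where p is the zero-inflation probability and mu the count-model mean; the numbers are made up and are not output from `zipath`.

```r
# Illustrative values only; not output from zipath.
mu       <- c(2.0, 0.5, 3.0)            # count-model means
eta_zero <- qlogis(c(0.1, 0.4, 0.25))   # linear predictor of the zero model
y        <- c(3, 0, 1)                  # observed counts

linkinv <- make.link("logit")$linkinv   # inverse link, as stored in `linkinv`
p0      <- linkinv(eta_zero)            # zero-inflation probabilities

fitted_mean <- (1 - p0) * mu            # marginal mean of a zero-inflated count
raw_resid   <- y - fitted_mean          # observed - fitted
```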

Author(s)

Zhu Wang <zwang145@uthsc.edu>

References

Zhu Wang, Shuangge Ma, Michael Zappitelli, Chirag Parikh, Ching-Yun Wang and Prasad Devarajan (2014) Penalized Count Data Regression with Application to Hospital Stay after Pediatric Cardiac Surgery, Statistical Methods in Medical Research. 2014 Apr 17. [Epub ahead of print]

Zhu Wang, Shuangge Ma, Ching-Yun Wang, Michael Zappitelli, Prasad Devarajan and Chirag R. Parikh (2014) EM for Regularized Zero Inflated Regression Models with Applications to Postoperative Morbidity after Cardiac Surgery in Children, Statistics in Medicine. 33(29):5192-208.

Zhu Wang, Shuangge Ma and Ching-Yun Wang (2015) Variable selection for zero-inflated and overdispersed data with application to health care demand in Germany, Biometrical Journal. 57(5):867-84.

Examples

## data
data("bioChemists", package = "pscl")
## intercept-only count model, all regressors in the zero component
fm_zip <- zipath(art ~ 1 | ., data = bioChemists, nlambda=10)
summary(fm_zip)
## with simple inflation (no regressors for zero component)
fm_zip <- zipath(art ~ . | 1, data = bioChemists, nlambda=10)
summary(fm_zip)
## Not run: 
fm_zip <- zipath(art ~ . | 1, data = bioChemists, nlambda=10)
summary(fm_zip)
fm_zinb <- zipath(art ~ . | 1, data = bioChemists, family = "negbin", nlambda=10)
summary(fm_zinb)
## inflation with regressors
## ("art ~ . | ." is "art ~ fem + mar + kid5 + phd + ment | fem + mar + kid5 + phd + ment")
fm_zip2 <- zipath(art ~ . | ., data = bioChemists, nlambda=10)
summary(fm_zip2)
fm_zinb2 <- zipath(art ~ . | ., data = bioChemists, family = "negbin", nlambda=10)
summary(fm_zinb2)
### non-penalized regression, compare with zeroinfl
fm_zinb3 <- zipath(art ~ . | ., data = bioChemists, family = "negbin",
                   lambda.count=0, lambda.zero=0, reltol=1e-12)
summary(fm_zinb3)
library("pscl")
fm_zinb4 <- zeroinfl(art ~ . | ., data = bioChemists, dist = "negbin")
summary(fm_zinb4)
### offset
exposure <- rep(0.5, nrow(bioChemists))
fm_zip_off <- zipath(art ~ . + offset(log(exposure)) | ., data = bioChemists,
                     family = "poisson", nlambda=10)
coef_zip_off <- coef(fm_zip_off)
### the offset is already contained in the fitted model, so it cannot be
### specified again in the predict function
pred <- predict(fm_zip_off)
## without inflation
## ("art ~ ." is "art ~ fem + mar + kid5 + phd + ment")
fm_pois <- glmreg(art ~ ., data = bioChemists, family = "poisson")
coef_pois <- coef(fm_pois)
fm_nb <- glmregNB(art ~ ., data = bioChemists)
coef_nb <- coef(fm_nb)
### high-dimensional
# R CMD check --use-valgrind can be too time-consuming for the following model
#bioChemists <- cbind(matrix(rnorm(915*100), nrow=915), bioChemists)
#fm_zinb <- zipath(art ~ . | ., data = bioChemists, family = "negbin", nlambda=10)

## End(Not run)

mpath documentation built on June 28, 2024, 1:06 a.m.