gooogle: A group-regularized fit to zero-inflated count data.


View source: R/Gooogle.R

Description

Fits zero-inflated count data with a group regularization algorithm.

Usage

gooogle(data, xvars, zvars, yvar, group=1:ncol(data), samegrp.overlap=T,
        penalty=c("grLasso", "grMCP", "grSCAD", "gBridge"),
        dist=c("poisson", "negbin"), nlambda=100, lambda,
        lambda.min=ifelse((nrow(data[,unique(c(xvars,zvars))])>ncol(data[,unique(c(xvars,zvars))])),1e-4,.05),
        lambda.max, crit="BIC", alpha=1, eps=.001, max.iter=1000,
        gmax=length(unique(group)),
        gamma=ifelse(penalty=="gBridge",0.5,ifelse(penalty == "grSCAD", 4, 3)),
        warn=TRUE)

Arguments

data

The data frame or matrix consisting of outcome and predictors.

xvars

The vector of variable names to be included in count model.

zvars

The vector of variable names for excess zero model.

yvar

The outcome variable name.

group

The vector of integers describing the grouping of the coefficients. For greatest efficiency and least ambiguity, it is best if group is a vector of consecutive integers. If there are coefficients to be included in the model without being penalized, assign them to group 0 (or "0").
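A grouping vector of this kind can be sketched in base R. The predictor names below are hypothetical, and the sketch assumes one group index per predictor, as in the package example further down; it only builds and checks the vector and does not call gooogle.

```r
# Hypothetical predictors: two unpenalized covariates, then two groups.
xnames <- c("age", "sex", "dose_1", "dose_2", "dose_3", "site_a", "site_b")

# Group 0 marks coefficients that are never penalized;
# the remaining groups use consecutive integers starting at 1.
group <- c(0, 0, rep(1, 3), rep(2, 2))
names(group) <- xnames

# Sanity checks: one index per predictor, and penalized groups are 1, 2, ..., G.
penalized <- sort(unique(group[group != 0]))
stopifnot(length(group) == length(xnames))
stopifnot(identical(as.numeric(penalized), as.numeric(seq_along(penalized))))

# Number of predictors in each group
table(group)
```

Checking the vector before fitting catches a group index that was skipped or a predictor that was left without a group.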

samegrp.overlap

A logical argument. If TRUE (default), predictors shared by the count and degenerate (excess zero) parts of the model are assigned the same grouping indices.

penalty

The penalty to be applied in the model. For group-level selection, one of "grLasso", "grMCP" or "grSCAD". For bi-level selection, "gBridge" can be specified.

dist

The distribution for the count model: "poisson" for Poisson or "negbin" for negative binomial.

nlambda

The number of lambda values. Default is 100.

lambda

A user specified sequence of lambda values.

lambda.min

The smallest value for lambda, as a fraction of lambda.max. Default is .0001 if the number of observations is larger than the number of covariates and .05 otherwise.
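The default rule and the role of lambda.min as a fraction of lambda.max can be illustrated as follows. The helper names `default_lambda_min` and `lambda_grid` are hypothetical and not part of the package. The log-spaced grid is an assumption (a common choice for regularization paths), not a confirmed description of how gooogle builds its sequence.

```r
# Default lambda.min rule from the Usage signature:
# 1e-4 when there are more observations than covariates, 0.05 otherwise.
default_lambda_min <- function(n, p) ifelse(n > p, 1e-4, 0.05)

# Hypothetical helper (not part of the package): a log-spaced grid running
# from lambda_max down to lambda_min * lambda_max.
lambda_grid <- function(lambda_max, lambda_min, nlambda = 100) {
  exp(seq(log(lambda_max), log(lambda_min * lambda_max), length.out = nlambda))
}

# Five grid points for a made-up lambda_max of 2 with n = 500, p = 20.
lam <- lambda_grid(lambda_max = 2,
                   lambda_min = default_lambda_min(n = 500, p = 20),
                   nlambda = 5)
lam
```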

lambda.max

The maximum value for lambda (only needed for gBridge penalty).

crit

The selection criterion for the best model. It can be either "AIC" or "BIC" (default).
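As a reminder of how the two criteria differ, the textbook definitions can be computed from a log-likelihood as below. The function name `info_criteria` and the numbers are made up for illustration, and how gooogle counts the number of parameters k for a penalized fit is not stated in this page, so treat k here as an assumption.

```r
# Standard information criteria from a log-likelihood.
# k is the number of estimated parameters (an assumption for penalized fits).
info_criteria <- function(loglik, k, n) {
  c(aic = -2 * loglik + 2 * k,
    bic = -2 * loglik + log(n) * k)
}

# Two hypothetical candidate fits (made-up numbers) on n = 200 observations.
small <- info_criteria(loglik = -310, k = 4, n = 200)
large <- info_criteria(loglik = -304, k = 9, n = 200)
rbind(small, large)
```

With these numbers AIC prefers the larger model while BIC prefers the smaller one, because the BIC penalty per parameter, log(n), exceeds 2 once n is larger than about 7.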

alpha

The tuning parameter for the balance between the group penalty and the L2 penalty, as in grpreg. Default value is 1.

eps

The convergence threshold, as in grpreg. Default is .001.

max.iter

Maximum number of iterations allowed.

gmax

Maximum number of non-zero groups allowed.

gamma

Tuning parameter of the group MCP, group SCAD, or group bridge penalty. Default is 3 for MCP, 4 for SCAD, and 0.5 for gBridge.

warn

A logical argument indicating whether the function should issue a warning when convergence problems occur. Default is TRUE.

Details

The algorithm fits zero-inflated count data and performs variable selection in the presence of an intrinsic grouping structure in the predictor set. Group-wise penalties are applied to both the count and the zero-abundance parts of the mixture model, and the penalized likelihood is optimized using group-level or bi-level coordinate descent algorithms.
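To make the mixture structure concrete, here is a minimal sketch of the unpenalized zero-inflated Poisson log-likelihood. It is not the package's implementation: the function name `zip_loglik`, the simulated data, and the choice of a log link for the count mean and a logit link for the excess-zero probability are all assumptions made for illustration, and the group penalties are omitted.

```r
# Log-likelihood of a zero-inflated Poisson mixture.
# y: counts; mu: Poisson means; pi0: probability of an excess (structural) zero.
zip_loglik <- function(y, mu, pi0) {
  ll_zero <- log(pi0 + (1 - pi0) * exp(-mu))          # y == 0: either source
  ll_pos  <- log(1 - pi0) + dpois(y, mu, log = TRUE)  # y > 0: count part only
  sum(ifelse(y == 0, ll_zero, ll_pos))
}

# Simulated data (made-up coefficients) with one covariate in both parts.
set.seed(1)
n <- 200
x <- rnorm(n)
mu  <- exp(0.5 + 0.3 * x)       # count model, log link (assumed)
pi0 <- plogis(-1 + 0.8 * x)     # zero-inflation model, logit link (assumed)
y <- ifelse(rbinom(n, 1, pi0) == 1, 0, rpois(n, mu))

zip_loglik(y, mu, pi0)
```

Setting pi0 to 0 reduces the expression to an ordinary Poisson log-likelihood, which is a quick way to check the function.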

Value

A list containing the following components is returned:

coefficients

A list with two sets of coefficients corresponding to count and zero inflation parts of the mixture model.

aic

The AIC of the selected model.

bic

The BIC of the selected model.

loglik

The log-likelihood of the selected model.

Examples

## Not run:
## Auto insurance claim data
library(Gooogle)
library(HDtweedie)
data("auto")
y <- round(auto$y)  # round the outcome to integer counts
x <- auto$x
data <- cbind.data.frame(y, x)

## One group index per predictor column
group <- c(rep(1, 5), rep(2, 7), rep(3, 4), rep(4:14, each = 3), 15:21)
yvar <- names(data)[1]
xvars <- names(data)[-1]
zvars <- xvars  # same predictors in the count and excess zero models

## ZIP regression
fit.poisson <- gooogle(data = data, yvar = yvar, xvars = xvars, zvars = zvars,
                       group = group, samegrp.overlap = TRUE,
                       dist = "poisson", penalty = "gBridge")
fit.poisson$aic

## ZINB regression
fit.negbin <- gooogle(data = data, yvar = yvar, xvars = xvars, zvars = zvars,
                      group = group, samegrp.overlap = TRUE,
                      dist = "negbin", penalty = "gBridge")
fit.negbin$aic

## End(Not run)

himelmallick/Gooogle documentation built on July 24, 2019, 1:52 a.m.