# zapoisson: Zero-Altered Poisson Distribution In VGAM: Vector Generalized Linear and Additive Models

## Description

Fits a zero-altered Poisson distribution based on a conditional model involving a Bernoulli distribution and a positive-Poisson distribution.

## Usage

    zapoisson(lpobs0 = "logitlink", llambda = "loglink",
              type.fitted = c("mean", "lambda", "pobs0", "onempobs0"),
              imethod = 1, ipobs0 = NULL, ilambda = NULL,
              ishrinkage = 0.95, probs.y = 0.35, zero = NULL)
    zapoissonff(llambda = "loglink", lonempobs0 = "logitlink",
                type.fitted = c("mean", "lambda", "pobs0", "onempobs0"),
                imethod = 1, ilambda = NULL, ionempobs0 = NULL,
                ishrinkage = 0.95, probs.y = 0.35, zero = "onempobs0")

## Arguments

| Argument | Description |
|---|---|
| lpobs0 | Link function for the probability of an observed zero, called pobs0 here. See Links for more choices. |
| llambda | Link function for the usual lambda parameter. See Links for more choices. |
| type.fitted | See CommonVGAMffArguments and fittedvlm for information. |
| lonempobs0 | Corresponding argument for the other parameterization. See details below. |
| imethod, ipobs0, ionempobs0, ilambda, ishrinkage | See CommonVGAMffArguments for information. |
| probs.y, zero | See CommonVGAMffArguments for information. |

## Details

The response Y is zero with probability pobs0, else Y has a positive-Poisson(lambda) distribution with probability 1-pobs0. Thus 0 < pobs0 < 1, which is modelled as a function of the covariates. The zero-altered Poisson distribution differs from the zero-inflated Poisson distribution in that the former has zeros coming from one source, whereas the latter has zeros coming from the Poisson distribution too. Some people call the zero-altered Poisson a hurdle model.
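The distinction between the zero-altered and zero-inflated models can be made concrete with a small base-R sketch. The helper names `dzap_sketch` and `dzip_sketch`, and the parameter name `pstr0` for the zero-inflated case, are illustrative assumptions and are not VGAM functions.

```r
# Zero-altered Poisson pmf (illustrative helper, not a VGAM function):
# P(Y = 0) = pobs0; for y >= 1 the positive-Poisson pmf is scaled by 1 - pobs0.
dzap_sketch <- function(y, lambda, pobs0) {
  pos <- dpois(y, lambda) / (1 - exp(-lambda))  # positive-Poisson pmf
  ifelse(y == 0, pobs0, (1 - pobs0) * pos)
}

# Zero-inflated Poisson pmf for contrast (illustrative helper):
# zeros come from a structural source AND from the Poisson component.
dzip_sketch <- function(y, lambda, pstr0) {
  pstr0 * (y == 0) + (1 - pstr0) * dpois(y, lambda)
}

lambda <- 2
p0 <- 0.3
y <- 0:100

sum(dzap_sketch(y, lambda, p0))  # should be 1 up to rounding error
dzap_sketch(0, lambda, p0)       # equals pobs0: zeros have a single source
dzip_sketch(0, lambda, p0)       # exceeds pstr0: Poisson zeros are added on top
```

In the zero-altered model the probability of a zero is exactly pobs0 regardless of lambda, whereas in the zero-inflated sketch it also depends on lambda through exp(-lambda).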

For one response/species, by default, the two linear/additive predictors for zapoisson() are (logit(pobs0), log(lambda))^T.

The VGAM family function zapoissonff() differs from zapoisson() in three ways: (i) the order of the linear/additive predictors is switched so that the Poisson mean comes first; (ii) the second parameter is onempobs0, which is 1 minus the probability of an observed 0, i.e., the probability of the positive-Poisson distribution, so that onempobs0 = 1 - pobs0; (iii) argument zero has a new default so that onempobs0 is intercept-only by default. zapoissonff() is now generally recommended over zapoisson(). Both functions implement Fisher scoring and can handle multiple responses.
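As a rough illustration of how the two parameterizations relate, the sketch below fits both family functions to the same simulated data. It assumes VGAM is installed, simulates the response with base R rather than a VGAM random-number function, and sets zero = NULL in zapoissonff() so that both fits use the same covariates for both parameters. Because logit(1 - p) = -logit(p), the onempobs0 coefficients should be close to the negatives of the pobs0 coefficients.

```r
library(VGAM)

set.seed(123)
n <- 1000
zdata <- data.frame(x2 = runif(n))
zdata <- transform(zdata,
                   pobs0  = plogis(-1 + 1 * x2),
                   lambda = exp(-0.5 + 2 * x2))

# Base-R simulation (an assumption of this sketch): zero with probability
# pobs0, otherwise a positive-Poisson draw via the inverse-CDF method.
zdata <- transform(zdata,
                   y = ifelse(rbinom(n, 1, pobs0) == 1, 0,
                              qpois(runif(n, dpois(0, lambda), 1), lambda)))

fit.za <- vglm(y ~ x2, zapoisson, data = zdata)
fit.ff <- vglm(y ~ x2, zapoissonff(zero = NULL), data = zdata)

coef(fit.za, matrix = TRUE)  # columns: pobs0 first, then lambda
coef(fit.ff, matrix = TRUE)  # columns: lambda first, then onempobs0
```

With the default zero = "onempobs0", the second fit would instead model onempobs0 with an intercept only, so the coefficients would no longer mirror each other.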

## Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

The fitted.values slot of the fitted object, which should be extracted by the generic function fitted, returns the mean mu (default) which is given by

mu = (1-pobs0) * lambda / [1 - exp(-lambda)].
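The formula for the mean can be checked numerically in base R by summing y times the probability mass function over a long enough grid. The parameter values below are arbitrary choices for illustration.

```r
lambda <- 2.5
pobs0  <- 0.4
y <- 0:200  # grid long enough that the truncated tail is negligible

# Zero-altered Poisson pmf: pobs0 at zero, scaled positive-Poisson elsewhere
pmf <- ifelse(y == 0, pobs0,
              (1 - pobs0) * dpois(y, lambda) / (1 - exp(-lambda)))

mu_formula <- (1 - pobs0) * lambda / (1 - exp(-lambda))
mu_numeric <- sum(y * pmf)

all.equal(mu_formula, mu_numeric)  # the two should agree up to rounding
```

The agreement follows because the positive-Poisson distribution has mean lambda / [1 - exp(-lambda)], and only the non-zero part of the model contributes to the mean.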

If type.fitted = "pobs0" then pobs0 is returned.

## Note

There are subtle differences between this family function and zipoisson and yip88. In particular, zipoisson is a mixture model whereas zapoisson() and yip88 are conditional models.

Note that this family function allows pobs0 to be modelled as a function of the covariates.

This family function effectively combines pospoisson and binomialff into one family function. This family function can handle multiple responses, e.g., more than one species.
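The two-part structure can be illustrated without VGAM, since the log-likelihood separates into a Bernoulli part for whether Y is zero and a positive-Poisson part for the non-zero counts. The sketch below uses only base R: glm() for the binary part and a hand-written truncated-Poisson log-likelihood maximized with optim(). The simulation settings and object names are assumptions made for illustration, and this is not how VGAM fits the model internally.

```r
set.seed(1)
n  <- 1000
x2 <- runif(n)
pobs0  <- plogis(-1 + 1 * x2)
lambda <- exp(-0.5 + 2 * x2)

# Simulate: zero with probability pobs0, else positive Poisson by inverse CDF
is_zero <- rbinom(n, 1, pobs0) == 1
y <- ifelse(is_zero, 0, qpois(runif(n, dpois(0, lambda), 1), lambda))

# Part 1: Bernoulli (logistic) model for P(Y = 0)
fit_zero <- glm(I(y == 0) ~ x2, family = binomial)

# Part 2: positive-Poisson model fitted to the non-zero counts only
pos <- y > 0
negll <- function(beta) {
  lam <- exp(beta[1] + beta[2] * x2[pos])
  # log pmf of the positive Poisson: log dpois(y) - log(1 - exp(-lam))
  -sum(dpois(y[pos], lam, log = TRUE) - log1p(-exp(-lam)))
}
fit_pos <- optim(c(0, 0), negll, method = "BFGS")

coef(fit_zero)  # estimates for logit(pobs0)
fit_pos$par     # estimates for log(lambda)
```

Fitting the two parts separately gives estimates for the same two linear predictors that zapoisson() estimates jointly; the single family function is more convenient, for example when both predictors share a formula or when constraints are needed.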

It is recommended that Gaitpois be used, e.g., rgaitpois(nn, lambda, pobs.mlm = pobs0, alt.mlm = 0) instead of rzapois(nn, lambda, pobs0 = pobs0).

## Author(s)

T. W. Yee

## References

Welsh, A. H., Cunningham, R. B., Donnelly, C. F. and Lindenmayer, D. B. (1996). Modelling the abundances of rare species: statistical models for counts with extra zeros. Ecological Modelling, 88, 297–308.

Angers, J-F. and Biswas, A. (2003). A Bayesian analysis of zero-inflated generalized Poisson model. Computational Statistics & Data Analysis, 42, 37–46.

Yee, T. W. (2014). Reduced-rank vector generalized linear models with two linear predictors. Computational Statistics and Data Analysis, 71, 889–902.

## Examples

    library(VGAM)

    # Simulate zero-altered Poisson data with one covariate
    zdata <- data.frame(x2 = runif(nn <- 1000))
    zdata <- transform(zdata, pobs0  = logitlink( -1 + 1*x2, inverse = TRUE),
                              lambda = loglink(-0.5 + 2*x2, inverse = TRUE))
    zdata <- transform(zdata, y = rgaitpois(nn, lambda, pobs.mlm = pobs0,
                                            alt.mlm = 0))

    with(zdata, table(y))
    fit <- vglm(y ~ x2, zapoisson, data = zdata, trace = TRUE)
    fit <- vglm(y ~ x2, zapoisson, data = zdata, trace = TRUE, crit = "coef")
    head(fitted(fit))
    head(predict(fit))
    head(predict(fit, untransform = TRUE))
    coef(fit, matrix = TRUE)
    summary(fit)

    # Another example ------------------------------
    # Data from Angers and Biswas (2003)
    abdata <- data.frame(y = 0:7, w = c(182, 41, 12, 2, 2, 0, 0, 1))
    abdata <- subset(abdata, w > 0)
    Abdata <- data.frame(yy = with(abdata, rep(y, w)))
    fit3 <- vglm(yy ~ 1, zapoisson, data = Abdata, trace = TRUE, crit = "coef")
    coef(fit3, matrix = TRUE)
    Coef(fit3)  # Estimate lambda (they get 0.6997 with SE 0.1520)
    head(fitted(fit3), 1)
    with(Abdata, mean(yy))  # Compare this with fitted(fit3)

### Example output

    y
      0   1   2   3   4   5   6   7   8   9  11
    373 242 169  97  61  30  17   6   3   1   1
    VGLM    linear loop  1 :  loglikelihood = -1530.0014
    VGLM    linear loop  2 :  loglikelihood = -1520.852
    VGLM    linear loop  3 :  loglikelihood = -1520.6892
    VGLM    linear loop  4 :  loglikelihood = -1520.6891
    VGLM    linear loop  5 :  loglikelihood = -1520.6891
    VGLM    linear loop  1 :  coefficients =
    -1.15506629, -0.13969254,  1.06503298,  1.67430700
    VGLM    linear loop  2 :  coefficients =
    -1.00181696, -0.38577208,  0.94150011,  1.91340250
    VGLM    linear loop  3 :  coefficients =
    -1.00612466, -0.43175560,  0.94670381,  1.96547030
    VGLM    linear loop  4 :  coefficients =
    -1.00612739, -0.43282828,  0.94670737,  1.96681725
    VGLM    linear loop  5 :  coefficients =
    -1.00612739, -0.43283452,  0.94670737,  1.96682673
    VGLM    linear loop  6 :  coefficients =
    -1.00612739, -0.43283456,  0.94670737,  1.96682679
           [,1]
    1 1.0193511
    2 1.4062908
    3 2.0502115
    4 0.9973532
    5 1.5525480
    6 1.3912642
      logitlink(pobs0) loglink(lambda)
    1       -0.9405177      -0.2965276
    2       -0.4736696       0.6733702
    3       -0.1699844       1.3042898
    4       -0.9998756      -0.4198461
    5       -0.3833394       0.8610353
    6       -0.4842181       0.6514552
          pobs0    lambda
    1 0.2807958 0.7433951
    2 0.3837481 1.9608347
    3 0.4576059 3.6850711
    4 0.2689659 0.6571479
    5 0.4053217 2.3656085
    6 0.3812566 1.9183304
                logitlink(pobs0) loglink(lambda)
    (Intercept)       -1.0061274      -0.4328346
    x2                 0.9467074       1.9668268

    Call:
    vglm(formula = y ~ x2, family = zapoisson, data = zdata, trace = TRUE,
        crit = "coef")

    Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
    (Intercept):1 -1.00613    0.13938  -7.219 5.25e-13 ***
    (Intercept):2 -0.43283    0.08784  -4.927 8.33e-07 ***
    x2:1           0.94671    0.23449   4.037 5.41e-05 ***
    x2:2           1.96683    0.12358  15.916  < 2e-16 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Log-likelihood: -1520.689 on 1996 degrees of freedom

    Number of Fisher scoring iterations: 6

    No Hauck-Donner effect found in any of the estimates

    VGLM    linear loop  1 :  coefficients = 1.25650524, 0.12608894
    VGLM    linear loop  2 :  coefficients =  1.14002466, -0.14124406
    VGLM    linear loop  3 :  coefficients =  1.14356045, -0.16530315
    VGLM    linear loop  4 :  coefficients =  1.14356368, -0.16572625
    VGLM    linear loop  5 :  coefficients =  1.14356368, -0.16572636
    VGLM    linear loop  6 :  coefficients =  1.14356368, -0.16572636