# zibinomial: Zero-Inflated Binomial Distribution Family Function In VGAM: Vector Generalized Linear and Additive Models

## Description

Fits a zero-inflated binomial distribution by maximum likelihood estimation.

## Usage

 ```1 2 3 4 5 6 7 8``` ```zibinomial(lpstr0 = "logitlink", lprob = "logitlink", type.fitted = c("mean", "prob", "pobs0", "pstr0", "onempstr0"), ipstr0 = NULL, zero = NULL, multiple.responses = FALSE, imethod = 1) zibinomialff(lprob = "logitlink", lonempstr0 = "logitlink", type.fitted = c("mean", "prob", "pobs0", "pstr0", "onempstr0"), ionempstr0 = NULL, zero = "onempstr0", multiple.responses = FALSE, imethod = 1) ```

## Arguments

 `lpstr0, lprob` Link functions for the parameter phi and the usual binomial probability prob parameter. See `Links` for more choices. For the zero-deflated model see below.
 `type.fitted` See `CommonVGAMffArguments` and `fittedvlm`. `ipstr0` Optional initial values for phi, whose values must lie between 0 and 1. The default is to compute an initial value internally. If a vector then recyling is used. `lonempstr0, ionempstr0` Corresponding arguments for the other parameterization. See details below.
 `multiple.responses` Logical. Currently it must be `FALSE` to mean the function does not handle multiple responses. This is to remain compatible with the same argument in `binomialff`. `zero, imethod` See `CommonVGAMffArguments` for information. Argument `zero` changed its default value for version 0.9-2.

## Details

These functions are based on

P(Y=0) = phi + (1- phi) * (1-prob)^N,

for y=0, and

P(Y=y) = (1-phi) * choose(N,Ny) * prob^(N*y) * (1-prob)^(N*(1-y)).

for y=1/N,2/N,…,1. That is, the response is a sample proportion out of N trials, and the argument `size` in `rzibinom` is N here. The parameter phi is the probability of a structural zero, and it satisfies 0 < phi < 1. The mean of Y is E(Y) = (1-phi) * prob and these are returned as the fitted values by default. By default, the two linear/additive predictors for `zibinomial()` are (logit(phi), logit(prob))^T.

The VGAM family function `zibinomialff()` has a few changes compared to `zibinomial()`. These are: (i) the order of the linear/additive predictors is switched so the binomial probability comes first; (ii) argument `onempstr0` is now 1 minus the probability of a structural zero, i.e., the probability of the parent (binomial) component, i.e., `onempstr0` is `1-pstr0`; (iii) argument `zero` has a new default so that the `onempstr0` is intercept-only by default. Now `zibinomialff()` is generally recommended over `zibinomial()`. Both functions implement Fisher scoring.

## Value

An object of class `"vglmff"` (see `vglmff-class`). The object is used by modelling functions such as `vglm` and `vgam`.

## Warning

Numerical problems can occur. Half-stepping is not uncommon. If failure to converge occurs, make use of the argument `ipstr0` or `ionempstr0`, or `imethod`.

## Note

The response variable must have one of the formats described by `binomialff`, e.g., a factor or two column matrix or a vector of sample proportions with the `weights` argument specifying the values of N.

To work well, one needs large values of N and prob>0, i.e., the larger N and prob are, the better. If N = 1 then the model is unidentifiable since the number of parameters is excessive.

Setting `stepsize = 0.5`, say, may aid convergence.

Estimated probabilities of a structural zero and an observed zero are returned, as in `zipoisson`.

The zero-deflated binomial distribution might be fitted by setting `lpstr0 = identitylink`, albeit, not entirely reliably. See `zipoisson` for information that can be applied here. Else try the zero-altered binomial distribution (see `zabinomial`).

T. W. Yee

## References

Welsh, A. H., Lindenmayer, D. B. and Donnelly, C. F. (2013). Fitting and interpreting occupancy models. PLOS One, 8, 1–21.

`rzibinom`, `binomialff`, `posbinomial`, `spikeplot`, `Binomial`.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18``` ```size <- 10 # Number of trials; N in the notation above nn <- 200 zdata <- data.frame(pstr0 = logitlink( 0, inverse = TRUE), # 0.50 mubin = logitlink(-1, inverse = TRUE), # Mean of usual binomial sv = rep(size, length = nn)) zdata <- transform(zdata, y = rzibinom(nn, size = sv, prob = mubin, pstr0 = pstr0)) with(zdata, table(y)) fit <- vglm(cbind(y, sv - y) ~ 1, zibinomialff, data = zdata, trace = TRUE) fit <- vglm(cbind(y, sv - y) ~ 1, zibinomialff, data = zdata, trace = TRUE, stepsize = 0.5) coef(fit, matrix = TRUE) Coef(fit) # Useful for intercept-only models head(fitted(fit, type = "pobs0")) # Estimate of P(Y = 0) head(fitted(fit)) with(zdata, mean(y)) # Compare this with fitted(fit) summary(fit) ```

### Example output

```Loading required package: stats4
y
0   1   2   3   4   5   6
108  20  23  28  12   5   4
VGLM    linear loop  1 :  loglikelihood = -342.55889
VGLM    linear loop  2 :  loglikelihood = -437.31768
Taking a modified step.
VGLM    linear loop  2 :  loglikelihood = -306.12565
VGLM    linear loop  3 :  loglikelihood = -293.52848
VGLM    linear loop  4 :  loglikelihood = -287.82025
VGLM    linear loop  5 :  loglikelihood = -287.78906
VGLM    linear loop  6 :  loglikelihood = -287.78906
Taking a modified step.
VGLM    linear loop  2 :  loglikelihood = -306.12565
Taking a modified step.
VGLM    linear loop  3 :  loglikelihood = -290.10584
Taking a modified step.
VGLM    linear loop  4 :  loglikelihood = -288.32519
Taking a modified step.
VGLM    linear loop  5 :  loglikelihood = -287.91978
Taking a modified step.
VGLM    linear loop  6 :  loglikelihood = -287.82142
Taking a modified step.
VGLM    linear loop  7 :  loglikelihood = -287.79712
Taking a modified step.
VGLM    linear loop  8 :  loglikelihood = -287.79107
Taking a modified step.
VGLM    linear loop  9 :  loglikelihood = -287.78957
Taking a modified step.
VGLM    linear loop  10 :  loglikelihood = -287.78919
Taking a modified step.
VGLM    linear loop  11 :  loglikelihood = -287.7891
Taking a modified step.
VGLM    linear loop  12 :  loglikelihood = -287.78907
Taking a modified step.
VGLM    linear loop  13 :  loglikelihood = -287.78907
Taking a modified step.
VGLM    linear loop  14 :  loglikelihood = -287.78906
(Intercept)       -1.076452          -0.05665979
prob onempstr0
0.2541780 0.4858388
[,1]
1 0.540034
2 0.540034
3 0.540034
4 0.540034
5 0.540034
6 0.540034
[,1]
1 0.1234895
2 0.1234895
3 0.1234895
4 0.1234895
5 0.1234895
6 0.1234895
 1.235

Call:
vglm(formula = cbind(y, sv - y) ~ 1, family = zibinomialff, data = zdata,
trace = TRUE, stepsize = 0.5)

Pearson residuals:
Min       1Q   Median       3Q   Max
logitlink(prob)      -1.8369 -0.09063 -0.09063 -0.09063 3.931
logitlink(onempstr0) -0.9184 -0.91844 -0.91844  1.04244 1.270

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1 -1.07645    0.08196 -13.134   <2e-16 ***
(Intercept):2 -0.05666    0.15075  -0.376    0.707
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1