zibinomial: Zero-Inflated Binomial Distribution Family Function

Description Usage Arguments Details Value Warning Note Author(s) References See Also Examples

View source: R/family.zeroinf.R

Description

Fits a zero-inflated binomial distribution by maximum likelihood estimation.

Usage

1
2
3
4
5
6
7
8
zibinomial(lpstr0 = "logitlink", lprob = "logitlink",
           type.fitted = c("mean", "prob", "pobs0", "pstr0", "onempstr0"),
           ipstr0 = NULL, zero = NULL, multiple.responses = FALSE,
           imethod = 1)
zibinomialff(lprob = "logitlink", lonempstr0 = "logitlink",
             type.fitted = c("mean", "prob", "pobs0", "pstr0", "onempstr0"),
             ionempstr0 = NULL, zero = "onempstr0",
             multiple.responses = FALSE, imethod = 1)

Arguments

lpstr0, lprob

Link functions for the parameter phi and the usual binomial probability prob parameter. See Links for more choices. For the zero-deflated model see below.

type.fitted

See CommonVGAMffArguments and fittedvlm.

ipstr0

Optional initial values for phi, whose values must lie between 0 and 1. The default is to compute an initial value internally. If a vector then recyling is used.

lonempstr0, ionempstr0

Corresponding arguments for the other parameterization. See details below.

multiple.responses

Logical. Currently it must be FALSE to mean the function does not handle multiple responses. This is to remain compatible with the same argument in binomialff.

zero, imethod

See CommonVGAMffArguments for information. Argument zero changed its default value for version 0.9-2.

Details

These functions are based on

P(Y=0) = phi + (1- phi) * (1-prob)^N,

for y=0, and

P(Y=y) = (1-phi) * choose(N,Ny) * prob^(N*y) * (1-prob)^(N*(1-y)).

for y=1/N,2/N,…,1. That is, the response is a sample proportion out of N trials, and the argument size in rzibinom is N here. The parameter phi is the probability of a structural zero, and it satisfies 0 < phi < 1. The mean of Y is E(Y) = (1-phi) * prob and these are returned as the fitted values by default. By default, the two linear/additive predictors for zibinomial() are (logit(phi), logit(prob))^T.

The VGAM family function zibinomialff() has a few changes compared to zibinomial(). These are: (i) the order of the linear/additive predictors is switched so the binomial probability comes first; (ii) argument onempstr0 is now 1 minus the probability of a structural zero, i.e., the probability of the parent (binomial) component, i.e., onempstr0 is 1-pstr0; (iii) argument zero has a new default so that the onempstr0 is intercept-only by default. Now zibinomialff() is generally recommended over zibinomial(). Both functions implement Fisher scoring.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

Warning

Numerical problems can occur. Half-stepping is not uncommon. If failure to converge occurs, make use of the argument ipstr0 or ionempstr0, or imethod.

Note

The response variable must have one of the formats described by binomialff, e.g., a factor or two column matrix or a vector of sample proportions with the weights argument specifying the values of N.

To work well, one needs large values of N and prob>0, i.e., the larger N and prob are, the better. If N = 1 then the model is unidentifiable since the number of parameters is excessive.

Setting stepsize = 0.5, say, may aid convergence.

Estimated probabilities of a structural zero and an observed zero are returned, as in zipoisson.

The zero-deflated binomial distribution might be fitted by setting lpstr0 = identitylink, albeit, not entirely reliably. See zipoisson for information that can be applied here. Else try the zero-altered binomial distribution (see zabinomial).

Author(s)

T. W. Yee

References

Welsh, A. H., Lindenmayer, D. B. and Donnelly, C. F. (2013). Fitting and interpreting occupancy models. PLOS One, 8, 1–21.

See Also

rzibinom, binomialff, posbinomial, spikeplot, Binomial.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
size <- 10  # Number of trials; N in the notation above
nn <- 200
zdata <- data.frame(pstr0 = logitlink( 0, inverse = TRUE),  # 0.50
                    mubin = logitlink(-1, inverse = TRUE),  # Mean of usual binomial
                    sv    = rep(size, length = nn))
zdata <- transform(zdata,
                   y = rzibinom(nn, size = sv, prob = mubin, pstr0 = pstr0))
with(zdata, table(y))
fit <- vglm(cbind(y, sv - y) ~ 1, zibinomialff, data = zdata, trace = TRUE)
fit <- vglm(cbind(y, sv - y) ~ 1, zibinomialff, data = zdata, trace = TRUE,
            stepsize = 0.5)

coef(fit, matrix = TRUE)
Coef(fit)  # Useful for intercept-only models
head(fitted(fit, type = "pobs0"))  # Estimate of P(Y = 0)
head(fitted(fit))
with(zdata, mean(y))  # Compare this with fitted(fit)
summary(fit)

Example output

Loading required package: stats4
Loading required package: splines
y
  0   1   2   3   4   5   6 
108  20  23  28  12   5   4 
VGLM    linear loop  1 :  loglikelihood = -342.55889
VGLM    linear loop  2 :  loglikelihood = -437.31768
Taking a modified step.
VGLM    linear loop  2 :  loglikelihood = -306.12565
VGLM    linear loop  3 :  loglikelihood = -293.52848
VGLM    linear loop  4 :  loglikelihood = -287.82025
VGLM    linear loop  5 :  loglikelihood = -287.78906
VGLM    linear loop  6 :  loglikelihood = -287.78906
Taking a modified step.
VGLM    linear loop  2 :  loglikelihood = -306.12565
Taking a modified step.
VGLM    linear loop  3 :  loglikelihood = -290.10584
Taking a modified step.
VGLM    linear loop  4 :  loglikelihood = -288.32519
Taking a modified step.
VGLM    linear loop  5 :  loglikelihood = -287.91978
Taking a modified step.
VGLM    linear loop  6 :  loglikelihood = -287.82142
Taking a modified step.
VGLM    linear loop  7 :  loglikelihood = -287.79712
Taking a modified step.
VGLM    linear loop  8 :  loglikelihood = -287.79107
Taking a modified step.
VGLM    linear loop  9 :  loglikelihood = -287.78957
Taking a modified step.
VGLM    linear loop  10 :  loglikelihood = -287.78919
Taking a modified step.
VGLM    linear loop  11 :  loglikelihood = -287.7891
Taking a modified step.
VGLM    linear loop  12 :  loglikelihood = -287.78907
Taking a modified step.
VGLM    linear loop  13 :  loglikelihood = -287.78907
Taking a modified step.
VGLM    linear loop  14 :  loglikelihood = -287.78906
            logitlink(prob) logitlink(onempstr0)
(Intercept)       -1.076452          -0.05665979
     prob onempstr0 
0.2541780 0.4858388 
      [,1]
1 0.540034
2 0.540034
3 0.540034
4 0.540034
5 0.540034
6 0.540034
       [,1]
1 0.1234895
2 0.1234895
3 0.1234895
4 0.1234895
5 0.1234895
6 0.1234895
[1] 1.235

Call:
vglm(formula = cbind(y, sv - y) ~ 1, family = zibinomialff, data = zdata, 
    trace = TRUE, stepsize = 0.5)

Pearson residuals:
                         Min       1Q   Median       3Q   Max
logitlink(prob)      -1.8369 -0.09063 -0.09063 -0.09063 3.931
logitlink(onempstr0) -0.9184 -0.91844 -0.91844  1.04244 1.270

Coefficients: 
              Estimate Std. Error z value Pr(>|z|)    
(Intercept):1 -1.07645    0.08196 -13.134   <2e-16 ***
(Intercept):2 -0.05666    0.15075  -0.376    0.707    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Names of linear predictors: logitlink(prob), logitlink(onempstr0)

Log-likelihood: -287.7891 on 398 degrees of freedom

Number of Fisher scoring iterations: 14 

No Hauck-Donner effect found in any of the estimates

VGAM documentation built on Jan. 16, 2021, 5:21 p.m.