zanegbinomial: Zero-Altered Negative Binomial Distribution


Description

Fits a zero-altered negative binomial distribution based on a conditional model involving a binomial distribution and a positive-negative binomial distribution.

Usage

zanegbinomial(zero = "size", type.fitted = c("mean", "munb", "pobs0"),
              mds.min = 1e-3, nsimEIM = 500, cutoff.prob = 0.999,
              eps.trig = 1e-7, max.support = 4000, max.chunk.MB = 30,
              lpobs0 = "logitlink", lmunb = "loglink", lsize = "loglink",
              imethod = 1, ipobs0 = NULL,
              imunb = NULL, iprobs.y = NULL, gprobs.y = (0:9)/10,
              isize = NULL, gsize.mux = exp(c(-30, -20, -15, -10, -6:3)))
zanegbinomialff(lmunb = "loglink", lsize = "loglink", lonempobs0 = "logitlink",
                type.fitted = c("mean", "munb", "pobs0", "onempobs0"),
                isize = NULL, ionempobs0 = NULL, zero = c("size",
                "onempobs0"), mds.min = 1e-3, iprobs.y = NULL, gprobs.y = (0:9)/10,
                cutoff.prob = 0.999, eps.trig = 1e-7, max.support = 4000,
                max.chunk.MB = 30, gsize.mux = exp(c(-30, -20, -15, -10, -6:3)),
                imethod = 1, imunb = NULL,
                nsimEIM = 500)

Arguments

lpobs0

Link function for the parameter pobs0, the probability of an observed zero. See Links for more choices.

lmunb

Link function applied to the munb parameter, which is the mean of an ordinary negative binomial distribution. See Links for more choices.

lsize

Parameter link function applied to the reciprocal of the dispersion parameter, called k. That is, as k increases, the variance of the response decreases. See Links for more choices.

type.fitted

See CommonVGAMffArguments and fittedvlm for information.

lonempobs0, ionempobs0

Corresponding arguments for the other parameterization, used by zanegbinomialff(). See Details below.

ipobs0, imunb, isize

Optional initial values for pobs0, munb and k. If given, one value may be supplied for each response/species by inputting a vector whose length is the number of columns of the response matrix.

zero

Specifies which of the three linear predictors are modelled as intercept-only. All parameters can be modelled as a function of the explanatory variables by setting zero = NULL (not recommended). A negative value means that the value is recycled, e.g., setting -3 means all k are intercept-only for zanegbinomial. See CommonVGAMffArguments for more information.

nsimEIM, imethod

See CommonVGAMffArguments.

iprobs.y, gsize.mux, gprobs.y

See negbinomial.

cutoff.prob, eps.trig

See negbinomial.

mds.min, max.support, max.chunk.MB

See negbinomial.

Details

The response Y is zero with probability pobs0, or Y has a positive-negative binomial distribution with probability 1-pobs0. Thus 0 < pobs0 < 1, which is modelled as a function of the covariates. The zero-altered negative binomial distribution differs from the zero-inflated negative binomial distribution in that the former has zeros coming from one source, whereas the latter has zeros coming from the negative binomial distribution too. The zero-inflated negative binomial distribution is implemented in the VGAM package. Some people call the zero-altered negative binomial a hurdle model.
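The conditional structure described above can be sketched in base R. The function name dzanb_sketch is hypothetical and is not part of VGAM; it only assumes the mean/size parameterization of dnbinom from the stats package.

```r
# Hypothetical helper (not a VGAM function): probability function of a
# zero-altered negative binomial with hurdle probability pobs0.
dzanb_sketch <- function(y, munb, size, pobs0) {
  # P(Y = 0) under the ordinary negative binomial
  p0_nb <- dnbinom(0, size = size, mu = munb)
  # Positive (zero-truncated) negative binomial probabilities for y >= 1
  pos_dens <- dnbinom(y, size = size, mu = munb) / (1 - p0_nb)
  # All zeros come from one source: the hurdle probability pobs0
  ifelse(y == 0, pobs0, (1 - pobs0) * pos_dens)
}

probs <- dzanb_sketch(0:2000, munb = 3, size = exp(1), pobs0 = 0.4)
total <- sum(probs)     # should be very close to 1
prob_zero <- probs[1]   # equals pobs0 by construction
```

Because the zero probability is set directly to pobs0, it can be smaller or larger than the zero probability of the ordinary negative binomial, which is what separates this model from the zero-inflated one.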

For one response/species, by default, the three linear/additive predictors for zanegbinomial() are (logit(pobs0), log(munb), log(k))^T. This vector is recycled for multiple species.

The VGAM family function zanegbinomialff() has a few changes compared to zanegbinomial(). These are: (i) the order of the linear/additive predictors is switched so the negative binomial mean comes first; (ii) the parameter onempobs0 is 1 minus the probability of an observed 0, i.e., the probability of the positive negative binomial distribution, so onempobs0 is 1-pobs0; (iii) argument zero has a new default so that onempobs0 (and hence pobs0) is intercept-only by default. zanegbinomialff() is now generally recommended over zanegbinomial(). Both functions implement Fisher scoring and can handle multiple responses.
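Under the default logit links, moving between pobs0 and onempobs0 only flips the sign of the corresponding linear predictor, since logit(1-p) = -logit(p). A minimal base R check, using arbitrary illustrative probabilities:

```r
# Illustrative probabilities of an observed zero (arbitrary values)
pobs0 <- c(0.1, 0.35, 0.5, 0.9)
onempobs0 <- 1 - pobs0

# qlogis() is the logit function in base R
eta_pobs0 <- qlogis(pobs0)
eta_onempobs0 <- qlogis(onempobs0)

# The two linear predictors differ only in sign
flip_ok <- isTRUE(all.equal(eta_onempobs0, -eta_pobs0))
```

Assuming the same covariates and constraints are used for both parameterizations, the coefficients for logit(onempobs0) would therefore be expected to be the negatives of those for logit(pobs0).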

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm and vgam.

The fitted.values slot of the fitted object, which should be extracted by the generic function fitted, returns the mean mu (default) which is given by

mu = (1-pobs0) * munb / [1 - (k/(k+munb))^k].

If type.fitted = "pobs0" then pobs0 is returned.
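As a worked check of the formula for mu, the sketch below applies the inverse of the default links to the first row of linear predictors for y1 printed in the example output further down, then evaluates the mean. It uses base R only; the numeric values are copied from that output.

```r
# First row of predict(fit) for response y1, in the zanegbinomial() order:
# logitlink(pobs01), loglink(munb1), loglink(size1)
eta <- c(-0.1413167, 0.81628993, 0.9724904)

pobs0 <- plogis(eta[1])  # inverse logit
munb  <- exp(eta[2])     # inverse log
k     <- exp(eta[3])     # inverse log

# Mean of the zero-altered negative binomial, as given above
mu <- (1 - pobs0) * munb / (1 - (k / (k + munb))^k)
```

The result should agree, up to rounding, with the first fitted value for y1 in the example output (1.504208).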

Warning

This family function is fragile; it inherits the same difficulties as posnegbinomial. Convergence for this VGAM family function seems to depend quite strongly on providing good initial values.

This VGAM family function is computationally expensive and usually runs slowly; setting trace = TRUE is useful for monitoring convergence.

Inference obtained from summary.vglm and summary.vgam may or may not be correct. In particular, the p-values, standard errors and degrees of freedom may need adjustment. Use simulation on artificial data to check that these are reasonable.

Note

This family function allows pobs0 to be modelled as a function of the covariates, provided zero is set correctly. It is a conditional model, not a mixture model. The estimation algorithm is simulated Fisher scoring.

This family function effectively combines posnegbinomial and binomialff into one family function.
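Because the model is conditional, the zero part can be examined on its own. The sketch below simulates data in base R with the same coefficients as the example on this page and fits only the binomial part with glm. It assumes no constraints link the pobs0 predictor to the others, and it is an illustration of the idea rather than what VGAM does internally.

```r
set.seed(1)
n  <- 2000
x2 <- runif(n)
pobs0 <- plogis(-1 + 2 * x2)   # true hurdle probability
munb  <- exp(0 + 2 * x2)       # true negative binomial mean
size  <- exp(1)

# Positive negative binomial draws by inverse-CDF sampling restricted to
# the region above P(Y = 0), so that qnbinom() returns values >= 1
p0_nb <- dnbinom(0, size = size, mu = munb)
y_pos <- qnbinom(runif(n, min = p0_nb, max = 1), size = size, mu = munb)

# Hurdle step: zero with probability pobs0, otherwise a positive count
y <- ifelse(runif(n) < pobs0, 0, y_pos)

# Binomial part alone: logistic regression on the zero indicator
zero_fit  <- glm(I(y == 0) ~ x2, family = binomial)
zero_coef <- coef(zero_fit)
```

The estimates in zero_coef should land near the true values (-1, 2), much as the logitlink(pobs0) columns do in the example output. The positive counts would be handled separately by a positive negative binomial fit.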

This family function can handle multiple responses, e.g., more than one species.

Author(s)

T. W. Yee

References

Welsh, A. H., Cunningham, R. B., Donnelly, C. F. and Lindenmayer, D. B. (1996). Modelling the abundances of rare species: statistical models for counts with extra zeros. Ecological Modelling, 88, 297–308.

Yee, T. W. (2014). Reduced-rank vector generalized linear models with two linear predictors. Computational Statistics and Data Analysis, 71, 889–902.

See Also

posnegbinomial, Gaitnbinom, negbinomial, binomialff, zinegbinomial, zipoisson, spikeplot, dnbinom, CommonVGAMffArguments, simulate.vlm.

Examples

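## Not run: 
library(VGAM)  # provides vglm(), zanegbinomial(), rzanegbin(), logitlink()

# Simulate two zero-altered negative binomial responses sharing one covariate
nn <- 2000
zdata <- data.frame(x2 = runif(nn))
zdata <- transform(zdata, pobs0 = logitlink(-1 + 2*x2, inverse = TRUE))
zdata <- transform(zdata,
         y1 = rzanegbin(nn, munb = exp(0+2*x2), size = exp(1), pobs0 = pobs0),
         y2 = rzanegbin(nn, munb = exp(1+2*x2), size = exp(1), pobs0 = pobs0))
with(zdata, table(y1))  # frequency table of each response
with(zdata, table(y2))

# Fit both responses at once; trace = TRUE prints the log-likelihood per iteration
fit <- vglm(cbind(y1, y2) ~ x2, zanegbinomial, data = zdata, trace = TRUE)
coef(fit, matrix = TRUE)  # one column per linear predictor
head(fitted(fit))         # fitted means (type.fitted = "mean")
head(predict(fit))        # linear predictors

## End(Not run)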

Example output

Loading required package: stats4
Loading required package: splines
y1
  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  16  17  18  25 
992 331 225 149  98  57  49  32  20   9   9   6   7   4   5   1   4   1   1 
y2
   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
1020   94  120   97   94   84   67   65   59   43   24   27   35   22   19   18 
  16   17   18   19   20   21   22   23   24   25   26   27   28   29   30   32 
  16   10    7    8   10    7   11    3    4    4    2    2    6    4    2    4 
  33   35   36   38   39   41   43   53 
   2    1    2    1    2    1    2    1 
VGLM    linear loop  1 :  loglikelihood = -7188.78901
VGLM    linear loop  2 :  loglikelihood = -7113.82114
VGLM    linear loop  3 :  loglikelihood = -7097.72228
VGLM    linear loop  4 :  loglikelihood = -7097.34413
VGLM    linear loop  5 :  loglikelihood = -7097.34276
VGLM    linear loop  6 :  loglikelihood = -7097.34276
            logitlink(pobs01) loglink(munb1) loglink(size1) logitlink(pobs02)
(Intercept)         -1.054113     -0.1321272      0.9724904        -0.9930123
x2                   2.073762      2.1546895      0.0000000         2.0734418
            loglink(munb2) loglink(size2)
(Intercept)       1.011341       1.058492
x2                1.999670       0.000000
        y1       y2
1 1.504208 3.561921
2 1.262180 2.534755
3 1.373077 3.038723
4 1.289414 2.664309
5 1.690789 4.195148
6 1.644859 4.049977
     logitlink(pobs01) loglink(munb1) loglink(size1) logitlink(pobs02)
[1,]        -0.1413167     0.81628993      0.9724904       -0.08035749
[2,]        -0.8889754     0.03945442      0.9724904       -0.82790058
[3,]        -0.5009283     0.44264472      0.9724904       -0.43991353
[4,]        -0.7829716     0.14959493      0.9724904       -0.72191319
[5,]         0.2816960     1.25581047      0.9724904        0.34258988
[6,]         0.1836250     1.15391226      0.9724904        0.24453399
     loglink(munb2) loglink(size2)
[1,]       1.891524       1.058492
[2,]       1.170578       1.058492
[3,]       1.544760       1.058492
[4,]       1.272794       1.058492
[5,]       2.299423       1.058492
[6,]       2.204856       1.058492

VGAM documentation built on Jan. 16, 2021, 5:21 p.m.