betabinomial: Beta-binomial Distribution Family Function
In VGAM: Vector Generalized Linear and Additive Models

betabinomial

R Documentation

Beta-binomial Distribution Family Function

Description

Fits a beta-binomial distribution by maximum likelihood estimation. The two parameters here are the mean and correlation coefficient.

Usage

betabinomial(lmu = "logitlink", lrho = "logitlink",
   irho = NULL, imethod = 1,
   ishrinkage = 0.95, nsimEIM = NULL, zero = "rho")

Arguments

`lmu`, `lrho`	Link functions applied to the two parameters. See `Links` for more choices. The defaults ensure the parameters remain in `(0,1)`, however, see the warning below. For `lrho`, `log1plink` (with an offset `log(size - 1)` for `\eta_2`) and `cloglink` may be very good choices.
`irho`	Optional initial value for the correlation parameter. If given, it must be in `(0,1)`, and is recyled to the necessary length. Assign this argument a value if a convergence failure occurs. Having `irho = NULL` means an initial value is obtained internally, though this can give unsatisfactory results.
`imethod`	An integer with value `1` or `2` or ..., which specifies the initialization method for `\mu`. If failure to converge occurs try the another value and/or else specify a value for `irho`.
`zero`	Specifyies which linear/additive predictor is to be modelled as an intercept only. If assigned, the single value can be either `1` or `2`. The default is to have a single correlation parameter. To model both parameters as functions of the covariates assign `zero = NULL`. See `CommonVGAMffArguments` for more information.
`ishrinkage`, `nsimEIM`	See `CommonVGAMffArguments` for more information. The argument `ishrinkage` is used only if `imethod = 2`. Using the argument `nsimEIM` may offer large advantages for large values of `N` and/or large data sets.

Details

There are several parameterizations of the beta-binomial distribution. This family function directly models the mean and correlation parameter, i.e., the probability of success. The model can be written T|P=p \sim Binomial(N,p) where P has a beta distribution with shape parameters \alpha and \beta. Here, N is the number of trials (e.g., litter size), T=NY is the number of successes, and p is the probability of a success (e.g., a malformation). That is, Y is the proportion of successes. Like binomialff, the fitted values are the estimated probability of success (i.e., E[Y] and not E[T]) and the prior weights N are attached separately on the object in a slot.

The probability function is

P(T=t) = {N \choose t} \frac{Be(\alpha+t, \beta+N-t)} {Be(\alpha, \beta)}

where t=0,1,\ldots,N, and Be is the beta function with shape parameters \alpha and \beta. Recall Y = T/N is the real response being modelled.

The default model is \eta_1 = logit(\mu) and \eta_2 = logit(\rho) because both parameters lie between 0 and 1. The mean (of Y) is p=\mu=\alpha/(\alpha+\beta) and the variance (of Y) is \mu(1-\mu)(1+(N-1)\rho)/N. Here, the correlation \rho is given by 1/(1 + \alpha + \beta) and is the correlation between the N individuals within a litter. A litter effect is typically reflected by a positive value of \rho. It is known as the over-dispersion parameter.

This family function uses Fisher scoring. Elements of the second-order expected derivatives with respect to \alpha and \beta are computed numerically, which may fail for large \alpha, \beta, N or else take a long time.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm.

Suppose fit is a fitted beta-binomial model. Then depvar(fit) are the sample proportions y, fitted(fit) returns estimates of E(Y), and weights(fit, type = "prior") returns the number of trials N.

Warning

If the estimated rho parameter is close to 0 then a good solution is to use extbetabinomial. Or you could try lrho = "rhobitlink".

This family function is prone to numerical difficulties due to the expected information matrices not being positive-definite or ill-conditioned over some regions of the parameter space. If problems occur try setting irho to some numerical value, nsimEIM = 100, say, or else use etastart argument of vglm, etc.

Note

This function processes the input in the same way as binomialff. But it does not handle the case N=1 very well because there are two parameters to estimate, not one, for each row of the input. Cases where N=1 can be omitted via the subset argument of vglm.

The extended beta-binomial distribution of Prentice (1986) implemented by extbetabinomial is the preferred VGAM family function for BBD regression.

Author(s)

T. W. Yee

References

Moore, D. F. and Tsiatis, A. (1991). Robust estimation of the variance in moment methods for extra-binomial and extra-Poisson variation. Biometrics, 47, 383–401.

Examples

# Example 1
bdata <- data.frame(N = 10, mu = 0.5, rho = 0.8)
bdata <- transform(bdata,
            y = rbetabinom(100, size = N, prob = mu, rho = rho))
fit <- vglm(cbind(y, N-y) ~ 1, betabinomial, bdata, trace = TRUE)
coef(fit, matrix = TRUE)
Coef(fit)
head(cbind(depvar(fit), weights(fit, type = "prior")))


# Example 2
fit <- vglm(cbind(R, N-R) ~ 1, betabinomial, lirat,
            trace = TRUE, subset = N > 1)
coef(fit, matrix = TRUE)
Coef(fit)
t(fitted(fit))
t(depvar(fit))
t(weights(fit, type = "prior"))


# Example 3, which is more complicated
lirat <- transform(lirat, fgrp = factor(grp))
summary(lirat)  # Only 5 litters in group 3
fit2 <- vglm(cbind(R, N-R) ~ fgrp + hb, betabinomial(zero = 2),
             data = lirat, trace = TRUE, subset = N > 1)
coef(fit2, matrix = TRUE)
## Not run:  with(lirat, plot(hb[N > 1], fit2@misc$rho,
         xlab = "Hemoglobin", ylab = "Estimated rho",
         pch = as.character(grp[N > 1]), col = grp[N > 1])) 
## End(Not run)
## Not run:   # cf. Figure 3 of Moore and Tsiatis (1991)
with(lirat, plot(hb, R / N, pch = as.character(grp), col = grp,
         xlab = "Hemoglobin level", ylab = "Proportion Dead",
         main = "Fitted values (lines)", las = 1))
smalldf <- with(lirat, lirat[N > 1, ])
for (gp in 1:4) {
  xx <- with(smalldf, hb[grp == gp])
  yy <- with(smalldf, fitted(fit2)[grp == gp])
  ooo <- order(xx)
  lines(xx[ooo], yy[ooo], col = gp, lwd = 2)
} 
## End(Not run)

VGAM documentation built on Sept. 18, 2024, 9:09 a.m.