View source: R/family.binomial.R

betabinomialff | R Documentation |

Fits a beta-binomial distribution by maximum likelihood estimation. The two parameters here are the shape parameters of the underlying beta distribution.

betabinomialff(lshape1 = "loglink", lshape2 = "loglink", ishape1 = 1, ishape2 = NULL, imethod = 1, ishrinkage = 0.95, nsimEIM = NULL, zero = NULL)

`lshape1, lshape2` |
Link functions for the two (positive) shape parameters
of the beta distribution.
See |

`ishape1, ishape2` |
Initial value for the shape parameters.
The first must be positive, and is recyled to the necessary
length. The second is optional. If a failure to converge
occurs, try assigning a different value to |

`zero` |
Can be
an integer specifying which linear/additive predictor
is to be modelled as an intercept only. If assigned, the
single value should be either |

`ishrinkage, nsimEIM, imethod` |
See |

There are several parameterizations of the beta-binomial
distribution. This family function directly models the two
shape parameters of the associated beta distribution rather than
the probability of success (however, see **Note** below).
The model can be written
*T|P=p ~ Binomial(N,p)*
where *P* has a beta distribution with shape parameters
*alpha* and *beta*. Here,
*N* is the number of trials (e.g., litter size),
*T=NY* is the number of successes, and
*p* is the probability of a success (e.g., a malformation).
That is, *Y* is the *proportion* of successes. Like
`binomialff`

, the fitted values are the
estimated probability
of success (i.e., *E[Y]* and not *E[T]*)
and the prior weights *N* are attached separately on the
object in a slot.

The probability function is

*
P(T=t) = choose(N,t) B(alpha+t, beta+N-t) / B(alpha, beta)*

where *t=0,1,…,N*, and *B* is the beta function
with shape parameters *alpha* and *beta*.
Recall *Y = T/N* is the real response being modelled.

The default model
is *eta1 = log(alpha)*
and *eta2 = log(beta)* because both
parameters are positive.
The mean (of *Y*) is
*p = mu = alpha / (alpha + beta)*
and the variance (of *Y*) is
*mu(1-mu)(1+(N-1)rho)/N*.
Here, the correlation *rho* is given by
*1/(1 + alpha + beta)*
and is the correlation between the *N* individuals
within a litter. A *litter effect* is typically reflected
by a positive value of *rho*. It is known as the
*over-dispersion parameter*.

This family function uses Fisher scoring. The two diagonal
elements of the second-order expected
derivatives with respect to *alpha* and
*beta* are computed numerically, which may
fail for large *alpha*, *beta*,
*N* or else take a long time.

An object of class `"vglmff"`

(see `vglmff-class`

).
The object is used by modelling functions
such as `vglm`

.

Suppose `fit`

is a fitted beta-binomial model. Then
`fit@y`

(better: `depvar(fit)`

) contains the sample
proportions *y*, `fitted(fit)`

returns estimates of
*E(Y)*, and `weights(fit, type = "prior")`

returns
the number of trials *N*.

This family function is prone to numerical difficulties due to
the expected information matrices not being positive-definite
or ill-conditioned over some regions of the parameter space.
If problems occur try setting `ishape1`

to be some other
positive value, using `ishape2`

and/or setting ```
zero
= 2
```

.

This family function may be renamed in the future.
See the warnings in `betabinomial`

.

This function processes the input in the same way
as `binomialff`

. But it does not handle
the case *N=1* very well because there are two
parameters to estimate, not one, for each row of the input.
Cases where *N=1* can be omitted via the
`subset`

argument of `vglm`

.

Although the two linear/additive predictors given above are
in terms of *alpha* and *beta*, basic
algebra shows that the default amounts to fitting a logit
link to the probability of success; subtracting the second
linear/additive predictor from the first gives that logistic
regression linear/additive predictor. That is, *logit(p) = eta1 - eta2*. This is illustated
in one of the examples below.

The *extended* beta-binomial distribution of Prentice
(1986) is currently not implemented in the VGAM package
as it has range-restrictions for the correlation parameter that
are currently too difficult to handle in this package.

T. W. Yee

Moore, D. F. and Tsiatis, A. (1991).
Robust estimation of the variance in moment methods for
extra-binomial and extra-Poisson variation.
*Biometrics*,
**47**, 383–401.

Prentice, R. L. (1986).
Binary regression using an extended beta-binomial distribution,
with discussion of correlation induced by
covariate measurement errors.
*Journal of the American Statistical Association*,
**81**, 321–327.

`betabinomial`

,
`Betabinom`

,
`binomialff`

,
`betaff`

,
`dirmultinomial`

,
`lirat`

,
`simulate.vlm`

.

# Example 1 N <- 10; s1 <- exp(1); s2 <- exp(2) y <- rbetabinom.ab(n = 100, size = N, shape1 = s1, shape2 = s2) fit <- vglm(cbind(y, N-y) ~ 1, betabinomialff, trace = TRUE) coef(fit, matrix = TRUE) Coef(fit) head(fit@misc$rho) # The correlation parameter head(cbind(depvar(fit), weights(fit, type = "prior"))) # Example 2 fit <- vglm(cbind(R, N-R) ~ 1, betabinomialff, data = lirat, trace = TRUE, subset = N > 1) coef(fit, matrix = TRUE) Coef(fit) fit@misc$rho # The correlation parameter t(fitted(fit)) t(depvar(fit)) t(weights(fit, type = "prior")) # A "loglink" link for the 2 shape params is a logistic regression: all.equal(c(fitted(fit)), as.vector(logitlink(predict(fit)[, 1] - predict(fit)[, 2], inverse = TRUE))) # Example 3, which is more complicated lirat <- transform(lirat, fgrp = factor(grp)) summary(lirat) # Only 5 litters in group 3 fit2 <- vglm(cbind(R, N-R) ~ fgrp + hb, betabinomialff(zero = 2), data = lirat, trace = TRUE, subset = N > 1) coef(fit2, matrix = TRUE) coef(fit2, matrix = TRUE)[, 1] - coef(fit2, matrix = TRUE)[, 2] # logitlink(p) ## Not run: with(lirat, plot(hb[N > 1], fit2@misc$rho, xlab = "Hemoglobin", ylab = "Estimated rho", pch = as.character(grp[N > 1]), col = grp[N > 1])) ## End(Not run) ## Not run: # cf. Figure 3 of Moore and Tsiatis (1991) with(lirat, plot(hb, R / N, pch = as.character(grp), col = grp, xlab = "Hemoglobin level", ylab = "Proportion Dead", las = 1, main = "Fitted values (lines)")) smalldf <- with(lirat, lirat[N > 1, ]) for (gp in 1:4) { xx <- with(smalldf, hb[grp == gp]) yy <- with(smalldf, fitted(fit2)[grp == gp]) ooo <- order(xx) lines(xx[ooo], yy[ooo], col = gp) } ## End(Not run)

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.