# zipebcom: Exchangeable Bivariate cloglog Odds-ratio Model From a... In VGAM: Vector Generalized Linear and Additive Models

## Description

Fits an exchangeable bivariate odds-ratio model to two binary responses with a complementary log-log link. The data are assumed to come from a zero-inflated Poisson distribution that has been converted to presence/absence.

## Usage

 ```1 2 3``` ```zipebcom(lmu12 = "clogloglink", lphi12 = "logitlink", loratio = "loglink", imu12 = NULL, iphi12 = NULL, ioratio = NULL, zero = c("phi12", "oratio"), tol = 0.001, addRidge = 0.001) ```

## Arguments

 `lmu12, imu12` Link function, extra argument and optional initial values for the first (and second) marginal probabilities. Argument `lmu12` should be left alone. Argument `imu12` may be of length 2 (one element for each response). `lphi12` Link function applied to the phi parameter of the zero-inflated Poisson distribution (see `zipoisson`). See `Links` for more choices. `loratio` Link function applied to the odds ratio. See `Links` for more choices. `iphi12, ioratio` Optional initial values for phi and the odds ratio. See `CommonVGAMffArguments` for more details. In general, good initial values (especially for `iphi12`) are often required, therefore use these arguments if convergence failure occurs. If inputted, the value of `iphi12` cannot be more than the sample proportions of zeros in either response.
 `zero` Which linear/additive predictor is modelled as an intercept only? A `NULL` means none. The default has both phi and the odds ratio as not being modelled as a function of the explanatory variables (apart from an intercept). `tol` Tolerance for testing independence. Should be some small positive numerical value. `addRidge` Some small positive numerical value. The first two diagonal elements of the working weight matrices are multiplied by `1+addRidge` to make it diagonally dominant, therefore positive-definite.

## Details

This VGAM family function fits an exchangeable bivariate odds ratio model (`binom2.or`) with a `clogloglink` link. The data are assumed to come from a zero-inflated Poisson (ZIP) distribution that has been converted to presence/absence. Explicitly, the default model is

cloglog[P(Y_j=1)/(1-phi)] = eta_1,\ \ \ j=1,2

for the (exchangeable) marginals, and

logit[phi] = eta_2,

for the mixing parameter, and

log[P(Y_{00}=1) P(Y_{11}=1) / (P(Y_{01}=1) P(Y_{10}=1))] = eta_3,

specifies the dependency between the two responses. Here, the responses equal 1 for a success and a 0 for a failure, and the odds ratio is often written psi=p00 p11 / (p10 p01). We have p10 = p01 because of the exchangeability.

The second linear/additive predictor models the phi parameter (see `zipoisson`). The third linear/additive predictor is the same as `binom2.or`, viz., the log odds ratio.

Suppose a dataset1 comes from a Poisson distribution that has been converted to presence/absence, and that both marginal probabilities are the same (exchangeable). Then `binom2.or("clogloglink", exch=TRUE)` is appropriate. Now suppose a dataset2 comes from a zero-inflated Poisson distribution. The first linear/additive predictor of `zipebcom()` applied to dataset2 is the same as that of `binom2.or("clogloglink", exch=TRUE)` applied to dataset1. That is, the phi has been taken care of by `zipebcom()` so that it is just like the simpler `binom2.or`.

Note that, for eta_1, `mu12 = prob12 / (1-phi12)` where `prob12` is the probability of a 1 under the ZIP model. Here, `mu12` correspond to `mu1` and `mu2` in the `binom2.or`-Poisson model.

If phi=0 then `zipebcom()` should be equivalent to `binom2.or("clogloglink", exch=TRUE)`. Full details are given in Yee and Dirnbock (2009).

The leading 2 x 2 submatrix of the expected information matrix (EIM) is of rank-1, not 2! This is due to the fact that the parameters corresponding to the first two linear/additive predictors are unidentifiable. The quick fix around this problem is to use the `addRidge` adjustment. The model is fitted by maximum likelihood estimation since the full likelihood is specified. Fisher scoring is implemented.

The default models eta2 and eta3 as single parameters only, but this can be circumvented by setting `zero=NULL` in order to model the phi and odds ratio as a function of all the explanatory variables.

## Value

An object of class `"vglmff"` (see `vglmff-class`). The object is used by modelling functions such as `vglm` and `vgam`.

When fitted, the `fitted.values` slot of the object contains the four joint probabilities, labelled as (Y1,Y2) = (0,0), (0,1), (1,0), (1,1), respectively. These estimated probabilities should be extracted with the `fitted` generic function.

## Warning

The fact that the EIM is not of full rank may mean the model is naturally ill-conditioned. Not sure whether there are any negative consequences wrt theory. For now it is certainly safer to fit `binom2.or` to bivariate binary responses.

## Note

The `"12"` in the argument names reinforce the user about the exchangeability assumption. The name of this VGAM family function stands for zero-inflated Poisson exchangeable bivariate complementary log-log odds-ratio model or ZIP-EBCOM.

See `binom2.or` for details that are pertinent to this VGAM family function too. Even better initial values are usually needed here.

The `xij` (see `vglm.control`) argument enables environmental variables with different values at the two time points to be entered into an exchangeable `binom2.or` model. See the author's webpage for sample code.

## References

Yee, T. W. and Dirnbock, T. (2009). Models for analysing species' presence/absence data at two time points. Journal of Theoretical Biology, 259(4), 684–694.

`binom2.or`, `zipoisson`, `clogloglink`, `CommonVGAMffArguments`.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30``` ```zdata <- data.frame(x2 = seq(0, 1, len = (nsites <- 2000))) zdata <- transform(zdata, eta1 = -3 + 5 * x2, phi1 = logitlink(-1, inverse = TRUE), oratio = exp(2)) zdata <- transform(zdata, mu12 = clogloglink(eta1, inverse = TRUE) * (1-phi1)) tmat <- with(zdata, rbinom2.or(nsites, mu1 = mu12, oratio = oratio, exch = TRUE)) zdata <- transform(zdata, ybin1 = tmat[, 1], ybin2 = tmat[, 2]) with(zdata, table(ybin1, ybin2)) / nsites # For interest only ## Not run: # Various plots of the data, for interest only par(mfrow = c(2, 2)) plot(jitter(ybin1) ~ x2, data = zdata, col = "blue") plot(jitter(ybin2) ~ jitter(ybin1), data = zdata, col = "blue") plot(mu12 ~ x2, data = zdata, col = "blue", type = "l", ylim = 0:1, ylab = "Probability", main = "Marginal probability and phi") with(zdata, abline(h = phi1, col = "red", lty = "dashed")) tmat2 <- with(zdata, dbinom2.or(mu1 = mu12, oratio = oratio, exch = TRUE)) with(zdata, matplot(x2, tmat2, col = 1:4, type = "l", ylim = 0:1, ylab = "Probability", main = "Joint probabilities")) ## End(Not run) # Now fit the model to the data. fit <- vglm(cbind(ybin1, ybin2) ~ x2, zipebcom, data = zdata, trace = TRUE) coef(fit, matrix = TRUE) summary(fit) vcov(fit) ```

### Example output ```Loading required package: stats4
ybin2
ybin1      0      1
0 0.5355 0.0990
1 0.0995 0.2660
VGLM    linear loop  1 :  loglikelihood = -1953.76553
VGLM    linear loop  2 :  loglikelihood = -1902.30849
VGLM    linear loop  3 :  loglikelihood = -1891.93592
VGLM    linear loop  4 :  loglikelihood = -1889.40357
VGLM    linear loop  5 :  loglikelihood = -1889.29681
VGLM    linear loop  6 :  loglikelihood = -1889.29513
VGLM    linear loop  7 :  loglikelihood = -1889.29506
VGLM    linear loop  8 :  loglikelihood = -1889.29505
(Intercept)         -2.993937        -1.013226        1.985475
x2                   4.915662         0.000000        0.000000

Call:
vglm(formula = cbind(ybin1, ybin2) ~ x2, family = zipebcom, data = zdata,
trace = TRUE)

Pearson residuals:
Min      1Q   Median     3Q   Max
clogloglink(mu12) -1.179 -0.4883 -0.26717 0.3494 5.991
logitlink(phi12)  -1.767 -0.5415  0.09645 0.2186 1.967
loglink(oratio)   -2.302  0.0894  0.23671 0.4112 8.676

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept):1  -2.9939     0.1359 -22.027   <2e-16 ***
(Intercept):2  -1.0132     0.1224  -8.279   <2e-16 ***
(Intercept):3   1.9855     0.1235  16.074   <2e-16 ***
x2              4.9157     0.2955  16.637   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Log-likelihood: -1889.295 on 5996 degrees of freedom

Number of Fisher scoring iterations: 8

No Hauck-Donner effect found in any of the estimates

(Intercept):1 (Intercept):2 (Intercept):3          x2
(Intercept):1   0.018474161  -0.001766813    0.00000000 -0.03404331
(Intercept):2  -0.001766813   0.014978307    0.00000000  0.01850160
(Intercept):3   0.000000000   0.000000000    0.01525679  0.00000000
x2             -0.034043314   0.018501602    0.00000000  0.08729865
```

VGAM documentation built on Jan. 16, 2021, 5:21 p.m.