glm.binomial.disp: Overdispersed binomial logit models



Description

This function estimates overdispersed binomial logit models using the approach discussed by Williams (1982).

Usage

glm.binomial.disp(object, maxit = 30, verbose = TRUE)

Arguments

object

an object of class "glm" providing a fitted binomial logistic regression model; see glm.

maxit

integer giving the maximum number of iterations for the model fitting procedure.

verbose

logical; if TRUE, information is printed at each step of the algorithm.

Details

Extra-binomial variation in logistic linear models is discussed, among others, in Collett (1991). Williams (1982) proposed a quasi-likelihood approach for handling overdispersion in logistic regression models.

Suppose we observe the number of successes y_i in m_i trials, for i = 1, …, n, such that

y_i | p_i ~ Binomial(m_i, p_i)

p_i ~ Beta(γ, δ)

Under this model, each of the n binomial observations has a different probability of success p_i, where p_i is a random draw from a Beta distribution. Thus,

E(p_i) = γ/(γ+δ) = θ

V(p_i) = φ θ (1-θ)

where φ = 1/(γ + δ + 1). Assuming γ > 1 and δ > 1, the Beta density is zero at the extreme values of zero and one, and thus 0 < φ <= 1/3. From this, the unconditional mean and variance can be calculated:

E(y_i) = m_i θ

V(y_i) = m_i θ (1 - θ)(1 + (m_i - 1) φ)

so unless m_i = 1 or φ = 0, the unconditional variance of y_i is larger than the binomial variance m_i θ (1 - θ).
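The moment formulas above can be checked numerically. The following is a minimal simulation sketch in base R, not part of the package; the parameter values, the seed, and the sample size are arbitrary choices made for illustration, and it relies on the standard Beta result φ = 1/(γ + δ + 1).

```r
# Simulate the Beta-binomial hierarchy and compare empirical moments
# with the formulas for E(y_i) and V(y_i). All values are illustrative.
set.seed(1)                          # arbitrary seed, for reproducibility
gamma <- 3; delta <- 5               # illustrative Beta parameters (both > 1)
m     <- 20                          # illustrative number of trials per unit
n     <- 200000                      # number of simulated units

theta <- gamma / (gamma + delta)     # E(p_i)
phi   <- 1 / (gamma + delta + 1)     # so that V(p_i) = phi * theta * (1 - theta)

p <- rbeta(n, gamma, delta)          # p_i ~ Beta(gamma, delta)
y <- rbinom(n, size = m, prob = p)   # y_i | p_i ~ Binomial(m, p_i)

theo.mean <- m * theta
theo.var  <- m * theta * (1 - theta) * (1 + (m - 1) * phi)
binom.var <- m * theta * (1 - theta) # variance without overdispersion

round(c(empirical = mean(y), theory = theo.mean), 3)
round(c(empirical = var(y), theory = theo.var, binomial = binom.var), 3)
```

With these settings the empirical variance should sit close to the overdispersed formula and well above the plain binomial variance.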

Identical expressions for the mean and variance of y_i can be obtained if we assume that the m_i binary responses on the i-th unit are dependent, with common pairwise correlation φ. In this case, -1/(m_i - 1) < φ <= 1.
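The correlated-trials interpretation can be made explicit with a short derivation. The notation z_ij for the j-th binary response on unit i is introduced here for illustration only and does not appear in the original text.

```latex
\begin{aligned}
y_i &= \sum_{j=1}^{m_i} z_{ij}, \qquad
E(z_{ij}) = \theta, \quad V(z_{ij}) = \theta(1-\theta), \quad
\operatorname{Corr}(z_{ij}, z_{ik}) = \phi \ (j \neq k), \\
V(y_i) &= \sum_{j=1}^{m_i} V(z_{ij})
          + \sum_{j \neq k} \operatorname{Cov}(z_{ij}, z_{ik}) \\
       &= m_i\,\theta(1-\theta) + m_i(m_i-1)\,\phi\,\theta(1-\theta) \\
       &= m_i\,\theta(1-\theta)\bigl(1 + (m_i-1)\phi\bigr).
\end{aligned}
```

Requiring V(y_i) > 0 for 0 < θ < 1 and m_i > 1 gives the lower bound φ > -1/(m_i - 1) quoted above.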

The method proposed by Williams uses an iterative algorithm for estimating the dispersion parameter φ and hence the necessary weights 1/(1 + φ(m_i - 1)) (for details see Williams, 1982).
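To make the idea of the iteration concrete, here is a simplified sketch in base R. It is not the package's source code and may differ from it in detail. It assumes the moment condition that the weighted Pearson statistic X² has approximate expectation Σ w_i (1 - h_i)(1 + φ(m_i - 1)), where h_i are the leverages of the weighted fit, and solves that condition for φ at each step. The helper name fit_phi_sketch, the single covariate x, and the simulated data are all hypothetical.

```r
# Simplified moment-based iteration for phi (hypothetical helper, not the
# dispmod source). y: successes, m: trials, x: a single covariate.
fit_phi_sketch <- function(y, m, x, maxit = 30, tol = 1e-8) {
  dat <- data.frame(y = y, m = m, x = x, w = 1)  # start from binomial weights
  phi <- 0
  for (iter in seq_len(maxit)) {
    fit <- glm(cbind(y, m - y) ~ x, family = binomial, data = dat, weights = w)
    X2  <- sum(residuals(fit, type = "pearson")^2)  # weighted Pearson statistic
    h   <- hatvalues(fit)                           # leverages of the weighted fit
    # Solve X2 = sum(w (1 - h)) + phi * sum(w (m - 1) (1 - h)) for phi
    phi.new <- (X2 - sum(dat$w * (1 - h))) / sum(dat$w * (dat$m - 1) * (1 - h))
    converged <- abs(phi.new - phi) < tol
    phi   <- phi.new
    dat$w <- 1 / (1 + phi * (dat$m - 1))            # weights 1/(1 + phi (m_i - 1))
    if (converged) break
  }
  # Final fit with the weights implied by the last value of phi
  fit <- glm(cbind(y, m - y) ~ x, family = binomial, data = dat, weights = w)
  list(phi = phi, weights = dat$w, fit = fit, iterations = iter)
}

# Illustrative overdispersed data (simulated, not from the package)
set.seed(42)
n        <- 60
m        <- sample(10:50, n, replace = TRUE)
x        <- rnorm(n)
theta    <- plogis(-0.5 + 0.8 * x)
phi.true <- 0.1
s        <- 1 / phi.true - 1            # gamma + delta, since phi = 1/(gamma + delta + 1)
p        <- rbeta(n, theta * s, (1 - theta) * s)
y        <- rbinom(n, size = m, prob = p)

res <- fit_phi_sketch(y, m, x)
res$phi          # estimated dispersion parameter
res$iterations   # number of iterations used
```

At convergence the weighted Pearson statistic of the final fit is close to its residual degrees of freedom, which is the fixed point of the update. The sketch does not guard against a negative estimate of φ, which can occur when the data are not overdispersed.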

Value

The function returns an object of class "glm" with the usual information and the added components:

dispersion

the estimated dispersion parameter.

disp.weights

the final weights used to fit the model.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc).

References

Collett, D. (1991), Modelling Binary Data, London: Chapman and Hall.

Williams, D. A. (1982), Extra-binomial variation in logistic linear models, Applied Statistics, 31, 144–148.

See Also

lm, glm, lm.disp, glm.poisson.disp

Examples

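library(dispmod)   # provides glm.binomial.disp() and the orobanche data
data(orobanche)

# Standard binomial logit fit: germinated seeds out of the total number of seeds
mod <- glm(cbind(germinated, seeds-germinated) ~ host*variety, data = orobanche,
           family = binomial(logit))
summary(mod)

# Refit allowing for overdispersion (Williams, 1982)
mod.disp <- glm.binomial.disp(mod)
summary(mod.disp)
mod.disp$dispersion   # estimated dispersion parameter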

Example output

Call:
glm(formula = cbind(germinated, seeds - germinated) ~ host * 
    variety, family = binomial(logit), data = orobanche)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.01617  -1.24398   0.05995   0.84695   2.12123  

Coefficients:
                      Estimate Std. Error z value Pr(>|z|)  
(Intercept)            -0.4122     0.1842  -2.238   0.0252 *
hostCuke                0.5401     0.2498   2.162   0.0306 *
varietyO.a75           -0.1459     0.2232  -0.654   0.5132  
hostCuke:varietyO.a75   0.7781     0.3064   2.539   0.0111 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 98.719  on 20  degrees of freedom
Residual deviance: 33.278  on 17  degrees of freedom
AIC: 117.87

Number of Fisher Scoring iterations: 4


Binomial overdispersed logit model fitting...
Iter.  1  phi: 0.02371848 
Iter.  2  phi: 0.0248754 
Iter.  3  phi: 0.02493477 
Iter.  4  phi: 0.02493781 
Iter.  5  phi: 0.02493797 
Iter.  6  phi: 0.02493797 
Converged after 6 iterations. 
Estimated dispersion parameter: 0.02493797 

Call:
glm(formula = cbind(germinated, seeds - germinated) ~ host * 
    variety, family = binomial(logit), data = orobanche, weights = disp.weights)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.90450  -0.85787   0.01759   0.76382   1.36185  

Coefficients:
                      Estimate Std. Error z value Pr(>|z|)  
(Intercept)           -0.46533    0.24387  -1.908   0.0564 .
hostCuke               0.51023    0.33472   1.524   0.1274  
varietyO.a75          -0.07009    0.31146  -0.225   0.8220  
hostCuke:varietyO.a75  0.81956    0.43522   1.883   0.0597 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 47.243  on 20  degrees of freedom
Residual deviance: 18.442  on 17  degrees of freedom
AIC: 65.578

Number of Fisher Scoring iterations: 4


Call:
glm(formula = cbind(germinated, seeds - germinated) ~ host * 
    variety, family = binomial(logit), data = orobanche, weights = disp.weights)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.90450  -0.85787   0.01759   0.76382   1.36185  

Coefficients:
                      Estimate Std. Error z value Pr(>|z|)  
(Intercept)           -0.46533    0.24387  -1.908   0.0564 .
hostCuke               0.51023    0.33472   1.524   0.1274  
varietyO.a75          -0.07009    0.31146  -0.225   0.8220  
hostCuke:varietyO.a75  0.81956    0.43522   1.883   0.0597 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 47.243  on 20  degrees of freedom
Residual deviance: 18.442  on 17  degrees of freedom
AIC: 65.578

Number of Fisher Scoring iterations: 4

[1] 0.02493797

dispmod documentation built on May 2, 2019, 2:48 p.m.