glm.poisson.disp: Overdispersed Poisson log-linear models

Description Usage Arguments Details Value Note References See Also Examples

View source: R/dispmod.R

Description

This function estimates overdispersed Poisson log-linear models using the approach discussed by Breslow N.E. (1984).

Usage

1
glm.poisson.disp(object, maxit = 30, verbose = TRUE)

Arguments

object

an object of class "glm" providing a fitted Poisson log-linear regression model; see glm.

maxit

integer giving the maximal number of iterations for the model fitting procedure.

verbose

logical, if TRUE information are printed during each step of the algorithm.

Details

Breslow (1984) proposed an iterative algorithm for fitting overdispersed Poisson log-linear models. The method is similar to that proposed by Williams (1982) for handling overdispersion in logistic regression models (see glm.binomial.disp).

Suppose we observe n independent responses such that

y_i | λ_i ~ Poisson(λ_i n_i)

for i = 1, …, n. The response variable y_i may be an event counts variable observed over a period of time (or in the space) of length n_i, whereas λ_i is the rate parameter. Then,

E(y_i | λ_i) = μ_i = λ_i n_i = exp(log(n_i) + log(λ_i))

where log(n_i) is an offset and log(λ_i) = β'x_i expresses the dependence of the Poisson rate parameter on a set of, say p, predictors. If the periods of time are all of the same length, we can set n_i = 1 for all i so the offset is zero.

The Poisson distribution has E(y_i | λ_i) = V(y_i | λ_i), but it may happen that the actual variance exceeds the nominal variance under the assumed probability model.

Suppose that θ_i = λ_i n_i is a random variable distributed according to

θ_i ~ Gamma (μ_i, 1/φ)

where E(θ_i) = μ_i and V(θ_i) = μ_i^2 φ. Thus, it can be shown that the unconditional mean and variance of y_i are given by

E(y_i) = μ_i

and

V(y_i) = μ_i + μ_i^2 φ = μ_i (1 + μ_iφ)

Hence, for φ > 0 we have overdispersion. It is interesting to note that the same mean and variance arise also if we assume a negative binomial distribution for the response variable.

The method proposed by Breslow uses an iterative algorithm for estimating the dispersion parameter φ and hence the necessary weights 1/(1 + μ_i \hat{φ}) (for details see Breslow, 1984).

Value

The function returns an object of class "glm" with the usual information and the added components:

dispersion

the estimated dispersion parameter.

disp.weights

the final weights used to fit the model.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)

References

Breslow, N.E. (1984), Extra-Poisson variation in log-linear models, Applied Statistics, 33, 38–44.

See Also

lm, glm, lm.disp, glm.binomial.disp

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
## Salmonella TA98 data
data(salmonellaTA98)
salmonellaTA98 <- within(salmonellaTA98, logx10 <- log(x+10))
mod <- glm(y ~ logx10 + x, data = salmonellaTA98, family = poisson(log)) 
summary(mod)

mod.disp <- glm.poisson.disp(mod)
summary(mod.disp)
mod.disp$dispersion

# compute predictions on a grid of x-values...
x0 <- with(salmonellaTA98, seq(min(x), max(x), length=50))
eta0 <- predict(mod, newdata = data.frame(logx10 = log(x0+10), x = x0), se=TRUE)
eta0.disp <- predict(mod.disp, newdata = data.frame(logx10 = log(x0+10), x = x0), se=TRUE)
# ... and plot the mean functions with variability bands
plot(y ~ x, data = salmonellaTA98)
lines(x0, exp(eta0$fit))
lines(x0, exp(eta0$fit+2*eta0$se), lty=2)
lines(x0, exp(eta0$fit-2*eta0$se), lty=2)
lines(x0, exp(eta0.disp$fit), col=3)
lines(x0, exp(eta0.disp$fit+2*eta0.disp$se), lty=2, col=3)
lines(x0, exp(eta0.disp$fit-2*eta0.disp$se), lty=2, col=3)

## Holford's data
data(holford)

mod <- glm(incid ~ offset(log(pop)) + Age + Cohort, data = holford, 
           family = poisson(log)) 
summary(mod)

mod.disp <- glm.poisson.disp(mod)
summary(mod.disp)
mod.disp$dispersion

Example output

Call:
glm(formula = y ~ logx10 + x, family = poisson(log), data = salmonellaTA98)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.5173  -1.1731  -0.4032   0.7995   3.7041  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  2.1727730  0.2184269   9.947  < 2e-16 ***
logx10       0.3198250  0.0570014   5.611 2.01e-08 ***
x           -0.0010130  0.0002452  -4.131 3.61e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 78.358  on 17  degrees of freedom
Residual deviance: 43.716  on 15  degrees of freedom
AIC: 142.25

Number of Fisher Scoring iterations: 4


Poisson overdispersed log-linear model fitting...
Iter.  1  phi: 0.07178889 
Iter.  2  phi: 0.07180372 
Iter.  3  phi: 0.07180614 
Iter.  4  phi: 0.07180702 
Iter.  5  phi: 0.07180731 
Iter.  6  phi: 0.07180741 
Iter.  7  phi: 0.07180744 
Iter.  8  phi: 0.07180745 
Iter.  9  phi: 0.07180745 
Iter.  10  phi: 0.07180745 
Converged after 10 iterations. 
Estimated dispersion parameter: 0.07180745 

Call:
glm(formula = y ~ logx10 + x, family = poisson(log), data = salmonellaTA98, 
    weights = disp.weights)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.4419  -0.6272  -0.2374   0.4564   1.9937  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  2.2030681  0.3635996   6.059 1.37e-09 ***
logx10       0.3109623  0.0990523   3.139  0.00169 ** 
x           -0.0009741  0.0004371  -2.228  0.02585 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 25.313  on 17  degrees of freedom
Residual deviance: 14.214  on 15  degrees of freedom
AIC: 50.799

Number of Fisher Scoring iterations: 4


Call:
glm(formula = y ~ logx10 + x, family = poisson(log), data = salmonellaTA98, 
    weights = disp.weights)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.4419  -0.6272  -0.2374   0.4564   1.9937  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)  2.2030681  0.3635996   6.059 1.37e-09 ***
logx10       0.3109623  0.0990523   3.139  0.00169 ** 
x           -0.0009741  0.0004371  -2.228  0.02585 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 25.313  on 17  degrees of freedom
Residual deviance: 14.214  on 15  degrees of freedom
AIC: 50.799

Number of Fisher Scoring iterations: 4

[1] 0.07180745

Call:
glm(formula = incid ~ offset(log(pop)) + Age + Cohort, family = poisson(log), 
    data = holford)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-3.214  -1.205  -0.172   1.462   3.670  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -8.62551    0.10123 -85.208  < 2e-16 ***
Age55-59       0.81903    0.02871  28.526  < 2e-16 ***
Age60-64       1.55271    0.02790  55.655  < 2e-16 ***
Age65-69       2.13322    0.02751  77.540  < 2e-16 ***
Age70-74       2.68681    0.02803  95.844  < 2e-16 ***
Age75-79       3.13410    0.02910 107.689  < 2e-16 ***
Age80-84       3.45545    0.03143 109.930  < 2e-16 ***
Cohort1860-64  0.36540    0.10901   3.352 0.000803 ***
Cohort1865-69  0.52338    0.10221   5.120 3.05e-07 ***
Cohort1870-74  0.77303    0.09976   7.749 9.25e-15 ***
Cohort1875-79  1.00959    0.09883  10.216  < 2e-16 ***
Cohort1880-84  1.15061    0.09832  11.702  < 2e-16 ***
Cohort1885-89  1.30700    0.09803  13.333  < 2e-16 ***
Cohort1890-94  1.50956    0.09856  15.317  < 2e-16 ***
Cohort1895-99  1.55288    0.09882  15.713  < 2e-16 ***
Cohort1900-04  1.59181    0.09932  16.027  < 2e-16 ***
Cohort1905-09  1.46105    0.10074  14.503  < 2e-16 ***
Cohort1910-14  1.36926    0.10396  13.171  < 2e-16 ***
Cohort1915-19  1.23725    0.11616  10.651  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 35493.66  on 48  degrees of freedom
Residual deviance:   127.38  on 30  degrees of freedom
AIC: 571.65

Number of Fisher Scoring iterations: 3


Poisson overdispersed log-linear model fitting...
Iter.  1  phi: 0.004150272 
Iter.  2  phi: 0.003747225 
Iter.  3  phi: 0.003651023 
Iter.  4  phi: 0.003626604 
Iter.  5  phi: 0.003620276 
Iter.  6  phi: 0.003618627 
Iter.  7  phi: 0.003618197 
Iter.  8  phi: 0.003618085 
Iter.  9  phi: 0.003618056 
Iter.  10  phi: 0.003618048 
Iter.  11  phi: 0.003618046 
Iter.  12  phi: 0.003618045 
Iter.  13  phi: 0.003618045 
Converged after 13 iterations. 
Estimated dispersion parameter: 0.003618045 

Call:
glm(formula = incid ~ offset(log(pop)) + Age + Cohort, family = poisson(log), 
    data = holford, weights = disp.weights)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.77737  -0.50352  -0.08971   0.58891   1.74713  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -8.64453    0.12539 -68.940  < 2e-16 ***
Age55-59       0.82267    0.04461  18.443  < 2e-16 ***
Age60-64       1.54904    0.04520  34.269  < 2e-16 ***
Age65-69       2.12764    0.04619  46.066  < 2e-16 ***
Age70-74       2.69624    0.04785  56.348  < 2e-16 ***
Age75-79       3.17241    0.05003  63.408  < 2e-16 ***
Age80-84       3.47447    0.05335  65.129  < 2e-16 ***
Cohort1860-64  0.35468    0.13296   2.668  0.00764 ** 
Cohort1865-69  0.51920    0.12542   4.140 3.48e-05 ***
Cohort1870-74  0.77428    0.12250   6.321 2.60e-10 ***
Cohort1875-79  1.01212    0.12128   8.345  < 2e-16 ***
Cohort1880-84  1.15071    0.12066   9.536  < 2e-16 ***
Cohort1885-89  1.29932    0.12035  10.796  < 2e-16 ***
Cohort1890-94  1.54559    0.12230  12.638  < 2e-16 ***
Cohort1895-99  1.57541    0.12344  12.763  < 2e-16 ***
Cohort1900-04  1.62779    0.12502  13.021  < 2e-16 ***
Cohort1905-09  1.46431    0.12792  11.447  < 2e-16 ***
Cohort1910-14  1.37192    0.13347  10.279  < 2e-16 ***
Cohort1915-19  1.25628    0.15029   8.359  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 9139.332  on 48  degrees of freedom
Residual deviance:   29.984  on 30  degrees of freedom
AIC: 193.63

Number of Fisher Scoring iterations: 3


Call:
glm(formula = incid ~ offset(log(pop)) + Age + Cohort, family = poisson(log), 
    data = holford, weights = disp.weights)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-1.77737  -0.50352  -0.08971   0.58891   1.74713  

Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
(Intercept)   -8.64453    0.12539 -68.940  < 2e-16 ***
Age55-59       0.82267    0.04461  18.443  < 2e-16 ***
Age60-64       1.54904    0.04520  34.269  < 2e-16 ***
Age65-69       2.12764    0.04619  46.066  < 2e-16 ***
Age70-74       2.69624    0.04785  56.348  < 2e-16 ***
Age75-79       3.17241    0.05003  63.408  < 2e-16 ***
Age80-84       3.47447    0.05335  65.129  < 2e-16 ***
Cohort1860-64  0.35468    0.13296   2.668  0.00764 ** 
Cohort1865-69  0.51920    0.12542   4.140 3.48e-05 ***
Cohort1870-74  0.77428    0.12250   6.321 2.60e-10 ***
Cohort1875-79  1.01212    0.12128   8.345  < 2e-16 ***
Cohort1880-84  1.15071    0.12066   9.536  < 2e-16 ***
Cohort1885-89  1.29932    0.12035  10.796  < 2e-16 ***
Cohort1890-94  1.54559    0.12230  12.638  < 2e-16 ***
Cohort1895-99  1.57541    0.12344  12.763  < 2e-16 ***
Cohort1900-04  1.62779    0.12502  13.021  < 2e-16 ***
Cohort1905-09  1.46431    0.12792  11.447  < 2e-16 ***
Cohort1910-14  1.37192    0.13347  10.279  < 2e-16 ***
Cohort1915-19  1.25628    0.15029   8.359  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 9139.332  on 48  degrees of freedom
Residual deviance:   29.984  on 30  degrees of freedom
AIC: 193.63

Number of Fisher Scoring iterations: 3

[1] 0.003618045

dispmod documentation built on May 2, 2019, 2:48 p.m.