lm.disp: Gaussian dispersion models

Description Usage Arguments Details Value Note References See Also Examples

View source: R/dispmod.R

Description

This function estimates Gaussian dispersion regression models.

Usage

1
2
3
lm.disp(formula, var.formula, data = list(), maxit = 30, 
        epsilon = glm.control()$epsilon, subset, na.action = na.omit, 
        contrasts = NULL, offset = NULL)

Arguments

formula

a symbolic description of the mean function of the model to be fit. For the details of model formula specification see lm and formula.

var.formula

a symbolic description of the variance function of the model to be fit. This must be a one-sided formula; if omitted the same terms used for the mean function are used. For the details of model formula specification see lm and formula.

data

an optional data frame containing the variables in the model. By default the variables are taken from environment(formula), typically the environment from which the function is called.

maxit

integer giving the maximal number of iterations for the model fitting procedure.

epsilon

tolerance value for checking convergence. See glm.control.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NA's. By default is set to na.omit, but other possibilities are available; see na.omit.

contrasts

an optional list as described in the contrasts.arg argument of model.matrix.default.

offset

this can be used to specify an a priori known component to be included in the linear predictor during fitting. An offset term can be included in the formula instead or as well, and if both are specified their sum is used.

Details

Gaussian dispersion models allow to model variance heterogeneity in Gaussian regression analysis using a log-linear model for the variance.

Suppose a response y is modelled as a function of a set of p predictors x through the linear model

y_i = β'x_i + e_i

where e_i ~ N(0, σ^2) under homogeneity.

Variance heterogeneity is modelled as

V(e_i) = σ^2 = exp(λ'z_i)

where z_i may contain some or all the variables in x_i and other variables not included in x_i; z_i is however assumed to contain a constant term.

The full model can be expressed as

E(y|x) = β'x

V(y|x) = exp(λ'z)

and it is fitted by maximum likelihood following the algorithm described in Aitkin (1987).

Value

lm.dispmod() returns an object of class "dispmod".

The summary method can be used to obtain and print a summary of the results.

An object of class "dispmod" is a list containing the following components:

call

the matched call.

mean

an object of class "glm" giving the fitted model for the mean function; see glm

var

an object of class "glm" giving the fitted model for the variance function; see glm.

initial.deviance

the value of the deviance at the beginning of the iterative procedure, i.e. assuming constant variance.

deviance

the value of the deviance at the end of the iterative procedure.

Note

Based on a similar procedure available in Arc (Cook and Weisberg, http://www.stat.umn.edu/arc)

References

Aitkin, M. (1987), Modelling variance heterogeneity in normal regression models using GLIM, Applied Statistics, 36, 332–339.

See Also

lm, glm, glm.binomial.disp, glm.poisson.disp, ncvTest.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
data(minitab)
minitab <- within(minitab, y <- V^(1/3) )
mod <- lm(y ~ H + D, data = minitab)
summary(mod)

mod.disp1 <- lm.disp(y ~ H + D, data = minitab)
summary(mod.disp1)

mod.disp2 <- lm.disp(y ~ H + D, ~ H, data = minitab)
summary(mod.disp2)

# Likelihood ratio test
deviances <- c(mod.disp1$initial.deviance,
               mod.disp2$deviance, 
               mod.disp1$deviance)
lrt <- c(NA, abs(diff(deviances)))
cbind(deviances, lrt, p.value = 1-pchisq(lrt, 1))

# quadratic dispersion model on D (as discussed by Aitkin)
mod.disp4 <- lm.disp(y ~ H + D, ~ D + I(D^2), data = minitab)
summary(mod.disp4)

r <- mod$residuals
phi.est <- mod.disp4$var$fitted.values
plot(minitab$D, log(r^2))
lines(minitab$D, log(phi.est))

Example output

Call:
lm(formula = y ~ H + D, data = minitab)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.159602 -0.050200 -0.006827  0.069649  0.133981 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.085388   0.184315  -0.463    0.647    
H            0.014472   0.002777   5.211 1.56e-05 ***
D            0.151516   0.005639  26.871  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08283 on 28 degrees of freedom
Multiple R-squared:  0.9777,	Adjusted R-squared:  0.9761 
F-statistic: 612.5 on 2 and 28 DF,  p-value: < 2.2e-16

Iteration 1: deviance = -69.62317
Iteration 2: deviance = -74.79983
Iteration 3: deviance = -74.79994
Iteration 4: deviance = -74.79994
Iteration 5: deviance = -74.79994
Gaussian dispersion model
-------------------------

Call:
lm.disp(formula = y ~ H + D, data = minitab)

Model for the mean:
y ~ H + D 
            Estimate Std. Error z value   Pr(>|z|)
(Intercept) -0.09892   0.128846 -0.7678  4.426e-01
H            0.01486   0.002110  7.0401  1.921e-12
D            0.15029   0.004879 30.8052 2.235e-208

(Dispersion parameter for gaussian family taken to be 1)

    Null deviance: 1773.4  on 30  degrees of freedom
Residual deviance:   31.0  on 28  degrees of freedom

Model for the variance:
~H + D 
             Estimate Std. Error z value  Pr(>|z|)
(Intercept) -14.04413    3.14704 -4.4626 8.096e-06
H             0.10972    0.04742  2.3141 2.066e-02
D             0.03429    0.09628  0.3561 7.217e-01

(Dispersion parameter for Gamma family taken to be 2)

    Null deviance: 83.542  on 30  degrees of freedom
Residual deviance: 73.080  on 28  degrees of freedom

-2*logLik(max), constant var. = -69.62317 
-2*logLik(max), model         = -74.79994 
LRT = 5.176767 on 2 df, p-value = 0.075141
Iteration 1: deviance = -69.62317
Iteration 2: deviance = -74.73305
Iteration 3: deviance = -74.73319
Iteration 4: deviance = -74.73319
Iteration 5: deviance = -74.73319
Gaussian dispersion model
-------------------------

Call:
lm.disp(formula = y ~ H + D, var.formula = ~H, data = minitab)

Model for the mean:
y ~ H + D 
            Estimate Std. Error z value   Pr(>|z|)
(Intercept) -0.10416   0.129982 -0.8013  4.229e-01
H            0.01493   0.002133  7.0002  2.555e-12
D            0.15029   0.004835 31.0830 4.090e-212

(Dispersion parameter for gaussian family taken to be 1)

    Null deviance: 1821.3  on 30  degrees of freedom
Residual deviance:   31.0  on 28  degrees of freedom

Model for the variance:
~H 
            Estimate Std. Error z value  Pr(>|z|)
(Intercept) -14.0110    3.09013  -4.534 5.785e-06
H             0.1153    0.04052   2.845 4.438e-03

(Dispersion parameter for Gamma family taken to be 2)

    Null deviance: 82.943  on 30  degrees of freedom
Residual deviance: 72.609  on 29  degrees of freedom

-2*logLik(max), constant var. = -69.62317 
-2*logLik(max), model         = -74.73319 
LRT = 5.110013 on 1 df, p-value = 0.023788
     deviances        lrt    p.value
[1,] -69.62317         NA         NA
[2,] -74.73319 5.11001338 0.02378814
[3,] -74.79994 0.06675401 0.79612294
Iteration 1: deviance = -69.62317
Iteration 2: deviance = -83.59174
Iteration 3: deviance = -84.86876
Iteration 4: deviance = -85.33767
Iteration 5: deviance = -85.46138
Iteration 6: deviance = -85.48339
Iteration 7: deviance = -85.48639
Iteration 8: deviance = -85.48675
Iteration 9: deviance = -85.48679
Iteration 10: deviance = -85.48679
Iteration 11: deviance = -85.48679
Iteration 12: deviance = -85.48679
Iteration 13: deviance = -85.48679
Gaussian dispersion model
-------------------------

Call:
lm.disp(formula = y ~ H + D, var.formula = ~D + I(D^2), data = minitab)

Model for the mean:
y ~ H + D 
            Estimate Std. Error z value  Pr(>|z|)
(Intercept)  0.09547  0.0530634   1.799 7.199e-02
H            0.01167  0.0009712  12.014 3.013e-33
D            0.15270  0.0016783  90.984 0.000e+00

(Dispersion parameter for gaussian family taken to be 1)

    Null deviance: 69793  on 30  degrees of freedom
Residual deviance:    31  on 28  degrees of freedom

Model for the variance:
~D + I(D^2) 
            Estimate Std. Error z value  Pr(>|z|)
(Intercept) -41.3974    4.75937  -8.698 3.375e-18
D             5.1722    0.69860   7.404 1.325e-13
I(D^2)       -0.1768    0.02467  -7.168 7.616e-13

(Dispersion parameter for Gamma family taken to be 2)

    Null deviance: 104.500  on 30  degrees of freedom
Residual deviance:  67.675  on 28  degrees of freedom

-2*logLik(max), constant var. = -69.62317 
-2*logLik(max), model         = -85.48679 
LRT = 15.86362 on 2 df, p-value = 0.00035914

dispmod documentation built on May 2, 2019, 2:48 p.m.

Related to lm.disp in dispmod...