Goodness of fit tests for binomial regression
x: A regression model with class glm.
...: Additional arguments when plotting the receiver-operating curve.
g: Number of groups (quantiles) into which to split observations for the Hosmer-Lemeshow and the modified Hosmer-Lemeshow tests.
plotROC: Plot a receiver-operating curve?
Details of the elements in the returned list follow below:

ct:
A contingency table, similar to the output of dx.
The following are given per covariate group:
  n      number of observations
  y1hat  predicted number of observations with y=1
  y1     actual number of observations with y=1
  y0hat  predicted number of observations with y=0
  y0     actual number of observations with y=0
chiSq:
Chi-squared tests of the significance of the model.
Pearson's test and the deviance D test are given.
These are calculated by individual (I), by covariate group (G)
and also from the contingency table (CT) above.
They are calculated as:
  chiSq = SUM Pr[i]^2
and
  D = SUM dr[i]^2
where Pr[i] and dr[i] are the Pearson and deviance residuals.
The statistics should follow a chiSq distribution with
n - p degrees of freedom.
Here, n is the number of observations
(taken individually or by covariate group) and p
is the number of predictors in the model.
A low p value for the test suggests
that the model is a poor fit.
The assumption of a chiSq distribution
is most valid when observations are considered by group.
The statistics from the contingency table should be
similar to those obtained when calculated by group.
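As a sketch (not the package's internal code), the by-individual statistics can be reproduced directly from a fitted glm; the model below on mtcars is a hypothetical example:

```r
## Pearson and deviance chi-squared statistics by individual
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
PrI <- sum(residuals(fit, type = "pearson")^2)   # Pearson chi-squared
drI <- sum(residuals(fit, type = "deviance")^2)  # deviance D
n <- nrow(mtcars)                                # number of observations
p <- length(coef(fit))                           # number of predictors
pPrI <- pchisq(PrI, df = n - p, lower.tail = FALSE)  # p value
```

Note that the sum of squared deviance residuals equals deviance(fit), the model deviance.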
ctHL:
The contingency table for the Hosmer-Lemeshow test.
The observations are ordered by probability, then
grouped into g groups of approximately equal size.
The columns are:
  P      the probability
  y1     the actual number of observations with y=1
  y1hat  the predicted number of observations with y=1
  y0     the actual number of observations with y=0
  y0hat  the predicted number of observations with y=0
  n      the number of observations
  Pbar   the mean probability, which is (n * P) / SUM(n)
gof:
All of these tests rely on assessing the effect of
adding an additional variable to the model.
Thus a low p value for any of these tests
implies that the model is a poor fit.

Hosmer and Lemeshow's C statistic is based on y[k], the number of
observations where y=1, n[k], the number of observations, and
Pbar[k], the average probability in group k:
  Pbar[k] = SUM(i=1...n[k]) n[i]P[i] / n[k], k = 1, 2, ..., g
The statistic is:
  C = SUM(k=1...g) (y[k] - n[k]Pbar[k])^2 / (n[k]Pbar[k](1 - Pbar[k]))
This should follow a chiSq distribution with g - 2 degrees of freedom.
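A sketch (not the package's internal code) of the C statistic, using a hypothetical model and g = 4 groups (the example data set is small):

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
P <- fitted(fit)
g <- 4
## split observations into g groups by quantile of probability
grp <- cut(P, quantile(P, seq(0, 1, length.out = g + 1)),
           include.lowest = TRUE)
y1   <- tapply(fit$y, grp, sum)   # observed y=1 per group
nk   <- tapply(P, grp, length)    # group sizes
Pbar <- tapply(P, grp, mean)      # mean probability per group
C <- sum((y1 - nk * Pbar)^2 / (nk * Pbar * (1 - Pbar)))
pval <- pchisq(C, df = g - 2, lower.tail = FALSE)
```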
The modified Hosmer and Lemeshow test assesses the
change in model deviance D when the grouping factor G
is added as a predictor.
That is, a linear model is fit as:
  dr[i] ~ G, where dr[i] is the deviance residual
and the effect of adding G is assessed with anova(lm(dr ~ G)).
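The steps above can be sketched as follows (a hypothetical model, with the grouping factor built from quartiles of the fitted probabilities; not the package's exact internals):

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
dr <- residuals(fit, type = "deviance")  # deviance residuals
## grouping factor G from quantiles of the fitted probabilities
G <- cut(fitted(fit), quantile(fitted(fit), seq(0, 1, 0.25)),
         include.lowest = TRUE)
a <- anova(lm(dr ~ G))       # F test for the effect of G
pval <- a[["Pr(>F)"]][1]
```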
The Osius-Rojek test is based on a power-divergence statistic PD[l] (l=1 for Pearson's test) and the standard deviation (herein, of a binomial distribution) SD. The statistic is:
  Z[OR] = (PD[l] - mean[l]) / SD[l]
For logistic regression, it is calculated as:
  Z[OR] = (chiSq - (n - p)) / (2 * (n - SUM 1/n[i]) + RSS)^0.5
where RSS is the residual sum-of-squares from a weighted linear regression:
  (1 - 2 * P[i]) / SD[i] ~ X, weights = SD[i]
Here X is the matrix of model predictors.
A two-tailed test against a standard normal distribution N(0, 1)
should not be significant.
The score and likelihood-ratio tests (Sst, Sll) are based on the addition of the vectors:
  z[1] = Pgeq0.5 = sign(P[i] >= 0.5)
and / or
  z[2] = Pl0.5 = sign(P[i] < 0.5)
to the existing model predictors.
The model fit is compared to the original using the
score (e.g. SstPgeq0.5) and likelihood-ratio (e.g. SllPl0.5) tests.
These models should not be a significantly better fit to the data.
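As a sketch of the likelihood-ratio variant for z[1] (a hypothetical model; the indicator follows the description above and is not necessarily the package's exact construction):

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
d <- mtcars
d$z1 <- as.numeric(fitted(fit) >= 0.5)   # Pgeq0.5 indicator
## refit with z1 added to the predictors and compare fits
fit2 <- glm(am ~ wt + hp + z1, family = binomial, data = d)
lrt <- anova(fit, fit2, test = "LRT")    # cf. SllPgeq0.5
pval <- lrt[["Pr(>Chi)"]][2]
```

A significant p value here would suggest the augmented model fits better, i.e. the original model is inadequate.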
R2:
Pseudo-R^2 comparisons of the predicted values from the
fitted model vs. an intercept-only model.
The sum-of-squares (linear-regression) measure, based on the squared
Pearson correlation coefficient by individual, uses the mean
probability:
  Pbar = SUM(y[i]) / n
and is given by:
  R2[ssI] = 1 - SUM(y[i] - P[i])^2 / SUM(y[i] - Pbar)^2
The same measure, by covariate group, is:
  R2[ssG] = 1 - SUM(y[i] - n[i] * P[i])^2 / SUM(y[i] - n[i] * Pbar)^2
The log-likelihood based R^2 measure per individual is based on:
  ll[0], the log-likelihood of the intercept-only model
  ll[p], the log-likelihood of the model with p covariates
It is calculated as:
  R2[llI] = (ll[0] - ll[p]) / ll[0]
This measure per covariate group is based on ll[s], the log-likelihood
for the saturated model, which is calculated from the model deviance D:
  ll[s] = ll[p] + D / 2
It is calculated as:
  R2[llG] = (ll[0] - ll[p]) / (ll[0] - ll[s])
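A sketch (not the package's internal code) of the two by-individual measures, using a hypothetical model:

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
y <- fit$y
P <- fitted(fit)
Pbar <- sum(y) / length(y)                        # mean probability
R2ssI <- 1 - sum((y - P)^2) / sum((y - Pbar)^2)   # sum-of-squares measure
ll0 <- as.numeric(logLik(update(fit, . ~ 1)))     # intercept-only model
llp <- as.numeric(logLik(fit))                    # fitted model
R2llI <- (ll0 - llp) / ll0                        # log-likelihood measure
```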
auc:
The area under the receiver-operating curve.
This may broadly be interpreted as:
  auc               Discrimination
  auc = 0.5         useless
  0.7 <= auc < 0.8  acceptable
  0.8 <= auc < 0.9  excellent
  auc >= 0.9        outstanding
auc >= 0.9 occurs rarely, as this requires almost
complete separation / perfect classification.
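The AUC can be sketched without plotting via its rank (Mann-Whitney) formulation, again with a hypothetical model:

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
P <- fitted(fit)
y <- fit$y
n1 <- sum(y == 1)   # observations with y=1
n0 <- sum(y == 0)   # observations with y=0
## probability a random y=1 case is ranked above a random y=0 case
auc <- (sum(rank(P)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
```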
A list of data.tables as follows:
  ct     Contingency table.
  chiSq  Chi-squared tests of the significance of the model.
  ctHL   Contingency table for the Hosmer-Lemeshow test.
  gof    Goodness-of-fit tests.
  R2     R-squared like tests.
  auc    Area under the receiver-operating curve (ROC) with 95% CIs.
Additionally, if plotROC=TRUE, a plot of the ROC.
The returned list has the additional class "gof.glm".
The print method for this class shows only those results which have a p value.
Modified Hosmer & Lemeshow goodness of fit test: adapted from existing work by Yongmei Ni.
Osius G & Rojek D (1992). Normal goodness-of-fit tests for multinomial models with large degrees of freedom. Journal of the American Statistical Association. 87(420):1145-52.
http://www.jstor.org/stable/2290653
Hosmer D, Hosmer T, Le Cessie S & Lemeshow S (1997). A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine. 16(9):965-80.
Mittlböck M & Schemper M (1996). Explained variation for logistic regression. Statistics in Medicine. 15(19):1987-97.
## H&L 2nd ed. Sections 5.2.2, 5.2.4, 5.2.5. Pages 147-167.
data(uis)
uis <- within(uis, {
NDRGFP1 <- 10 / (NDRGTX + 1)
NDRGFP2 <- NDRGFP1 * log((NDRGTX + 1) / 10)
})
g1 <- gof(glm(DFREE ~ AGE + NDRGFP1 + NDRGFP2 + IVHX +
RACE + TREAT + SITE +
AGE:NDRGFP1 + RACE:SITE,
family=binomial, data=uis), plotROC=FALSE)
g1
unclass(g1)
attributes(g1$gof)
chiSq df pVal
PrI 580.74 564 0.30389
drI 597.96 564 0.15592
PrG 511.78 510 0.46949
drG 530.74 510 0.25410
PrCT 511.78 510 0.46949
drCT 530.74 510 0.25410
val df pVal
HL chiSq 4.394114 8 0.819931
mHL F 0.763346 9 0.650553
OsRo Z 0.115106 NA 0.908361
SstPgeq0.5 Z 1.297926 NA 0.194313
SstPl0.5 Z 0.107475 NA 0.914412
SstBoth chiSq 1.696164 2 0.428236
SllPgeq0.5 chiSq 3.387409 1 0.065696 .
SllPl0.5 chiSq 0.011483 1 0.914662
SllBoth chiSq 3.953985 2 0.138485
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$ct
n y1hat y1 y0hat y0
1: 1 0.03236867 0 0.9676313 1
2: 1 0.02857757 0 0.9714224 1
3: 2 0.92867458 1 1.0713254 1
4: 1 0.04242218 0 0.9575778 1
5: 1 0.04048063 0 0.9595194 1
---
517: 1 0.22337160 1 0.7766284 0
518: 1 0.22004414 1 0.7799559 0
519: 1 0.16759820 1 0.8324018 0
520: 1 0.16262781 1 0.8373722 0
521: 1 0.03262591 1 0.9673741 0
$chiSq
test chiSq df pVal
1: PrI 580.7352 564 0.3038915
2: drI 597.9629 564 0.1559194
3: PrG 511.7807 510 0.4694862
4: drG 530.7412 510 0.2541018
5: PrCT 511.7807 510 0.4694862
6: drCT 530.7412 510 0.2541018
$ctHL
P y1 y1hat y0 y0hat n Pbar
1: 0.0939 4 4.099756 54 53.90024 58 0.07068545
2: 0.125 5 6.229210 52 50.77079 57 0.10928438
3: 0.163 8 8.538861 50 49.46114 58 0.14722174
4: 0.204 11 10.401876 46 46.59812 57 0.18248905
5: 0.234 16 12.705793 42 45.29421 58 0.21906540
6: 0.279 11 14.463885 46 42.53611 57 0.25375237
7: 0.324 18 17.506976 40 40.49302 58 0.30184442
8: 0.376 24 19.796622 33 37.20338 57 0.34730917
9: 0.459 23 23.915891 35 34.08411 58 0.41234295
10: 0.728 27 29.341129 30 27.65887 57 0.51475666
$gof
test stat val df pVal
1: HL chiSq 4.39411425 8 0.81993079
2: mHL F 0.76334613 9 0.65055345
3: OsRo Z 0.11510645 NA 0.90836075
4: SstPgeq0.5 Z 1.29792647 NA 0.19431260
5: SstPl0.5 Z 0.10747461 NA 0.91441247
6: SstBoth chiSq 1.69616390 2 0.42823552
7: SllPgeq0.5 chiSq 3.38740879 1 0.06569612
8: SllPl0.5 chiSq 0.01148318 1 0.91466236
9: SllBoth chiSq 3.95398469 2 0.13848513
$R2
method R2
1: ssI 0.09464760
2: ssG 0.09959974
3: llI 0.08530447
4: llG 0.09508152
$auc
auc lower 95% CI upper 95% CI
69.89081 65.13074 74.65087
attr(,"interpret")
[1] "auc = 0.5 --> useless" "0.7 < auc < 0.8 --> good"
[3] "0.8 < auc < 0.9 --> excellent"
attr(,"link")
[1] "logit"
$names
[1] "test" "stat" "val" "df" "pVal"
$row.names
[1] 1 2 3 4 5 6 7 8 9
$class
[1] "data.table" "data.frame"
$.internal.selfref
<pointer: 0x1e37b88>
$interpret
[1] "Low p value: reject H0 that the model is a good fit."
[2] "I.e. model is a poor fit."
$g
[1] "10 groups"