Goodness of fit tests for binomial regression
x: A regression model with class glm.
...: Additional arguments when plotting the receiver-operating curve.
g: Number of groups (quantiles) into which to split observations for the Hosmer-Lemeshow and the modified Hosmer-Lemeshow tests.
plotROC: Plot a receiver-operating curve?
Details of the elements in the returned list follow below:

ct:
A contingency table, similar to the output of dx.
The following are given per covariate group:
  n      number of observations
  y1hat  predicted number of observations with y=1
  y1     actual number of observations with y=1
  y0hat  predicted number of observations with y=0
  y0     actual number of observations with y=0
chiSq:
Chi-squared tests of the significance of the model.
Pearson's test and the deviance D test are given.
These are calculated by individual (I), by covariate group (G)
and also from the contingency table (CT) above.
They are calculated as:
  chiSq = SUM Pr[i]^2
and
  D = SUM dr[i]^2
where Pr[i] and dr[i] are the Pearson and deviance residuals.
The statistics should follow a chiSq distribution with
n - p degrees of freedom.
Here, n is the number of observations
(taken individually or by covariate group) and p
is the number of predictors in the model.
A low p value for the test suggests
that the model is a poor fit.
The assumption of a chiSq distribution
is most valid when observations are considered by group.
The statistics from the contingency table should be
similar to those obtained when calculated by group.
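As a sketch (not the package's internal code), the by-individual statistics can be reproduced directly from a fitted glm; the model below on mtcars is a hypothetical example:

```r
## Pearson and deviance chi-squared statistics by individual
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
PrI <- sum(residuals(fit, type = "pearson")^2)   # Pearson chi-squared
drI <- sum(residuals(fit, type = "deviance")^2)  # deviance D
n <- nrow(mtcars)                                # number of observations
p <- length(coef(fit))                           # number of predictors
pPrI <- pchisq(PrI, df = n - p, lower.tail = FALSE)  # p value
```

Note that the sum of squared deviance residuals equals deviance(fit), the model deviance.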
ctHL:
The contingency table for the Hosmer-Lemeshow test.
The observations are ordered by probability, then
grouped into g groups of approximately equal size.
The columns are:
  P      the probability
  y1     the actual number of observations with y=1
  y1hat  the predicted number of observations with y=1
  y0     the actual number of observations with y=0
  y0hat  the predicted number of observations with y=0
  n      the number of observations
  Pbar   the mean probability, which is (n * P) / SUM(n)
gof:
All of these tests rely on assessing the effect of
adding an additional variable to the model.
Thus a low p value for any of these tests
implies that the model is a poor fit.

Hosmer and Lemeshow's C statistic is based on y[k], the number of
observations where y=1, n[k], the number of observations, and
Pbar[k], the average probability in group k:
  Pbar[k] = SUM(i=1...n[k]) n[i]P[i] / n[k], k = 1, 2, ..., g
The statistic is:
  C = SUM(k=1...g) (y[k] - n[k]Pbar[k])^2 / (n[k]Pbar[k](1 - Pbar[k]))
This should follow a chiSq distribution with g - 2 degrees of freedom.
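A sketch (not the package's internal code) of the C statistic, using a hypothetical model and g = 4 groups (the example data set is small):

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
P <- fitted(fit)
g <- 4
## split observations into g groups by quantile of probability
grp <- cut(P, quantile(P, seq(0, 1, length.out = g + 1)),
           include.lowest = TRUE)
y1   <- tapply(fit$y, grp, sum)   # observed y=1 per group
nk   <- tapply(P, grp, length)    # group sizes
Pbar <- tapply(P, grp, mean)      # mean probability per group
C <- sum((y1 - nk * Pbar)^2 / (nk * Pbar * (1 - Pbar)))
pval <- pchisq(C, df = g - 2, lower.tail = FALSE)
```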
The modified Hosmer and Lemeshow test assesses the
change in model deviance D when the grouping factor G
is added as a predictor.
That is, a linear model is fit as:
  dr[i] ~ G, where dr[i] is the deviance residual
and the effect of adding G is assessed with anova(lm(dr ~ G)).
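The steps above can be sketched as follows (a hypothetical model, with the grouping factor built from quartiles of the fitted probabilities; not the package's exact internals):

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
dr <- residuals(fit, type = "deviance")  # deviance residuals
## grouping factor G from quantiles of the fitted probabilities
G <- cut(fitted(fit), quantile(fitted(fit), seq(0, 1, 0.25)),
         include.lowest = TRUE)
a <- anova(lm(dr ~ G))       # F test for the effect of G
pval <- a[["Pr(>F)"]][1]
```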
The Osius-Rojek test is based on a power-divergence statistic PD[l] (l=1 for Pearson's test) and the standard deviation (herein, of a binomial distribution) SD. The statistic is:
  Z[OR] = (PD[l] - mean[l]) / SD[l]
For logistic regression, it is calculated as:
  Z[OR] = (chiSq - (n - p)) / (2 * (n - SUM 1/n[i]) + RSS)^0.5
where RSS is the residual sum-of-squares from a weighted linear regression:
  (1 - 2 * P[i]) / SD[i] ~ X, weights = SD[i]
Here X is the matrix of model predictors.
A two-tailed test against a standard normal distribution N(0, 1)
should not be significant.
The score and likelihood-ratio tests (Sst, Sll) are based on the addition of the vectors:
  z[1] = Pgeq0.5 = sign(P[i] >= 0.5)
and / or
  z[2] = Pl0.5 = sign(P[i] < 0.5)
to the existing model predictors.
The model fit is compared to the original using the
score (e.g. SstPgeq0.5) and likelihood-ratio (e.g. SllPl0.5) tests.
These models should not be a significantly better fit to the data.
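As a sketch of the likelihood-ratio variant for z[1] (a hypothetical model; the indicator follows the description above and is not necessarily the package's exact construction):

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
d <- mtcars
d$z1 <- as.numeric(fitted(fit) >= 0.5)   # Pgeq0.5 indicator
## refit with z1 added to the predictors and compare fits
fit2 <- glm(am ~ wt + hp + z1, family = binomial, data = d)
lrt <- anova(fit, fit2, test = "LRT")    # cf. SllPgeq0.5
pval <- lrt[["Pr(>Chi)"]][2]
```

A significant p value here would suggest the augmented model fits better, i.e. the original model is inadequate.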
R2:
Pseudo-R^2 comparisons of the predicted values from the
fitted model vs. an intercept-only model.
The sum-of-squares (linear-regression) measure, based on the squared
Pearson correlation coefficient by individual, uses the mean
probability:
  Pbar = SUM(y[i]) / n
and is given by:
  R2[ssI] = 1 - SUM(y[i] - P[i])^2 / SUM(y[i] - Pbar)^2
The same measure, by covariate group, is:
  R2[ssG] = 1 - SUM(y[i] - n[i] * P[i])^2 / SUM(y[i] - n[i] * Pbar)^2
The log-likelihood based R^2 measure per individual is based on:
  ll[0], the log-likelihood of the intercept-only model
  ll[p], the log-likelihood of the model with p covariates
It is calculated as:
  R2[llI] = (ll[0] - ll[p]) / ll[0]
This measure per covariate group is based on ll[s], the log-likelihood
for the saturated model, which is calculated from the model deviance D:
  ll[s] = ll[p] + D / 2
It is calculated as:
  R2[llG] = (ll[0] - ll[p]) / (ll[0] - ll[s])
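A sketch (not the package's internal code) of the two by-individual measures, using a hypothetical model:

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
y <- fit$y
P <- fitted(fit)
Pbar <- sum(y) / length(y)                        # mean probability
R2ssI <- 1 - sum((y - P)^2) / sum((y - Pbar)^2)   # sum-of-squares measure
ll0 <- as.numeric(logLik(update(fit, . ~ 1)))     # intercept-only model
llp <- as.numeric(logLik(fit))                    # fitted model
R2llI <- (ll0 - llp) / ll0                        # log-likelihood measure
```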
auc:
The area under the receiver-operating curve.
This may broadly be interpreted as:
  auc               Discrimination
  auc = 0.5         useless
  0.7 <= auc < 0.8  acceptable
  0.8 <= auc < 0.9  excellent
  auc >= 0.9        outstanding
auc >= 0.9 occurs rarely, as this requires almost
complete separation / perfect classification.
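The AUC can be sketched without plotting via its rank (Mann-Whitney) formulation, again with a hypothetical model:

```r
fit <- glm(am ~ wt + hp, family = binomial, data = mtcars)
P <- fitted(fit)
y <- fit$y
n1 <- sum(y == 1)   # observations with y=1
n0 <- sum(y == 0)   # observations with y=0
## probability a random y=1 case is ranked above a random y=0 case
auc <- (sum(rank(P)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
```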
A list of data.tables as follows:
  ct     Contingency table.
  chiSq  Chi-squared tests of the significance of the model.
  ctHL   Contingency table for the Hosmer-Lemeshow test.
  gof    Goodness-of-fit tests.
  R2     R-squared like tests.
  auc    Area under the receiver-operating curve (ROC) with 95% CIs.
Additionally, if plotROC=TRUE, a plot of the ROC.
The returned list has the additional class "gof.glm".
The print method for this class shows only those results which have a p value.
Modified Hosmer & Lemeshow goodness of fit test: adapted from existing work by Yongmei Ni.
Osius G & Rojek D (1992). Normal goodness-of-fit tests for multinomial models with large degrees of freedom. Journal of the American Statistical Association. 87(420):1145-52.
http://www.jstor.org/stable/2290653
Hosmer D, Hosmer T, Le Cessie S & Lemeshow S (1997). A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine. 16(9):965-80.
Mittlböck M & Schemper M (1996). Explained variation for logistic regression. Statistics in Medicine. 15(19):1987-97.
## H&L 2nd ed. Sections 5.2.2, 5.2.4, 5.2.5. Pages 147-167.
data(uis)
uis <- within(uis, {
NDRGFP1 <- 10 / (NDRGTX + 1)
NDRGFP2 <- NDRGFP1 * log((NDRGTX + 1) / 10)
})
g1 <- gof(glm(DFREE ~ AGE + NDRGFP1 + NDRGFP2 + IVHX +
RACE + TREAT + SITE +
AGE:NDRGFP1 + RACE:SITE,
family=binomial, data=uis), plotROC=FALSE)
g1
unclass(g1)
attributes(g1$gof)
chiSq df pVal
PrI 580.74 564 0.30389
drI 597.96 564 0.15592
PrG 511.78 510 0.46949
drG 530.74 510 0.25410
PrCT 511.78 510 0.46949
drCT 530.74 510 0.25410
val df pVal
HL chiSq 4.394114 8 0.819931
mHL F 0.763346 9 0.650553
OsRo Z 0.115106 NA 0.908361
SstPgeq0.5 Z 1.297926 NA 0.194313
SstPl0.5 Z 0.107475 NA 0.914412
SstBoth chiSq 1.696164 2 0.428236
SllPgeq0.5 chiSq 3.387409 1 0.065696 .
SllPl0.5 chiSq 0.011483 1 0.914662
SllBoth chiSq 3.953985 2 0.138485
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
$ct
n y1hat y1 y0hat y0
1: 1 0.03236867 0 0.9676313 1
2: 1 0.02857757 0 0.9714224 1
3: 2 0.92867458 1 1.0713254 1
4: 1 0.04242218 0 0.9575778 1
5: 1 0.04048063 0 0.9595194 1
---
517: 1 0.22337160 1 0.7766284 0
518: 1 0.22004414 1 0.7799559 0
519: 1 0.16759820 1 0.8324018 0
520: 1 0.16262781 1 0.8373722 0
521: 1 0.03262591 1 0.9673741 0
$chiSq
test chiSq df pVal
1: PrI 580.7352 564 0.3038915
2: drI 597.9629 564 0.1559194
3: PrG 511.7807 510 0.4694862
4: drG 530.7412 510 0.2541018
5: PrCT 511.7807 510 0.4694862
6: drCT 530.7412 510 0.2541018
$ctHL
P y1 y1hat y0 y0hat n Pbar
1: 0.0939 4 4.099756 54 53.90024 58 0.07068545
2: 0.125 5 6.229210 52 50.77079 57 0.10928438
3: 0.163 8 8.538861 50 49.46114 58 0.14722174
4: 0.204 11 10.401876 46 46.59812 57 0.18248905
5: 0.234 16 12.705793 42 45.29421 58 0.21906540
6: 0.279 11 14.463885 46 42.53611 57 0.25375237
7: 0.324 18 17.506976 40 40.49302 58 0.30184442
8: 0.376 24 19.796622 33 37.20338 57 0.34730917
9: 0.459 23 23.915891 35 34.08411 58 0.41234295
10: 0.728 27 29.341129 30 27.65887 57 0.51475666
$gof
test stat val df pVal
1: HL chiSq 4.39411425 8 0.81993079
2: mHL F 0.76334613 9 0.65055345
3: OsRo Z 0.11510645 NA 0.90836075
4: SstPgeq0.5 Z 1.29792647 NA 0.19431260
5: SstPl0.5 Z 0.10747461 NA 0.91441247
6: SstBoth chiSq 1.69616390 2 0.42823552
7: SllPgeq0.5 chiSq 3.38740879 1 0.06569612
8: SllPl0.5 chiSq 0.01148318 1 0.91466236
9: SllBoth chiSq 3.95398469 2 0.13848513
$R2
method R2
1: ssI 0.09464760
2: ssG 0.09959974
3: llI 0.08530447
4: llG 0.09508152
$auc
auc lower 95% CI upper 95% CI
69.89081 65.13074 74.65087
attr(,"interpret")
[1] "auc = 0.5 --> useless" "0.7 < auc < 0.8 --> good"
[3] "0.8 < auc < 0.9 --> excellent"
attr(,"link")
[1] "logit"
$names
[1] "test" "stat" "val" "df" "pVal"
$row.names
[1] 1 2 3 4 5 6 7 8 9
$class
[1] "data.table" "data.frame"
$.internal.selfref
<pointer: 0x1e37b88>
$interpret
[1] "Low p value: reject H0 that the model is a good fit."
[2] "I.e. model is a poor fit."
$g
[1] "10 groups"