# logitgof: Hosmer-Lemeshow Tests for Logistic Regression Models

In generalhoslem: Goodness of Fit Tests for Logistic Regression Models

## Description

Performs the Hosmer-Lemeshow goodness of fit tests for binary, multinomial and ordinal logistic regression models.

## Usage

```
logitgof(obs, exp, g = 10, ord = FALSE)
```

## Arguments

| Argument | Description |
|---|---|
| `obs` | a vector of observed values. See details. |
| `exp` | expected values fitted by the model. See details. |
| `g` | number of quantiles of risk, 10 by default. |
| `ord` | logical indicating whether to run the ordinal version, `FALSE` by default. |

## Details

The Hosmer-Lemeshow tests are goodness of fit tests for binary, multinomial and ordinal logistic regression models, and `logitgof` can perform all three. Each test compares observed with expected frequencies of the outcome across `g` groups (quantiles of risk) and computes a test statistic that approximately follows a chi-squared distribution. The degrees of freedom depend on the number of quantiles used and the number of outcome categories. A non-significant p-value indicates that there is no evidence that the observed and expected frequencies differ, that is, no evidence of lack of fit.
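To make the comparison of observed and expected frequencies concrete, here is a minimal base-R sketch of the binary statistic. It is an illustration only, not the code used by `logitgof`: the helper name `hl_binary_sketch` and the simulated data are assumptions, and the grouping rule (cut points at quantiles of the fitted probabilities) is the textbook one.

```r
# Hypothetical helper, for illustration only (not part of generalhoslem).
hl_binary_sketch <- function(obs, fitted_p, g = 10) {
  # Group observations by quantiles of the fitted probabilities
  breaks <- unique(quantile(fitted_p, probs = seq(0, 1, length.out = g + 1)))
  grp <- cut(fitted_p, breaks = breaks, include.lowest = TRUE)

  # Observed and expected counts of events and non-events in each group
  n    <- tapply(obs, grp, length)
  obs1 <- tapply(obs, grp, sum)
  exp1 <- tapply(fitted_p, grp, sum)
  obs0 <- n - obs1
  exp0 <- n - exp1

  # Pearson-type statistic over the (groups x 2) table
  stat <- sum((obs1 - exp1)^2 / exp1 + (obs0 - exp0)^2 / exp0)
  df   <- length(n) - 2
  list(statistic = stat, df = df,
       p.value = pchisq(stat, df, lower.tail = FALSE))
}

# Simulated data (an assumption, chosen so that every group is well filled)
set.seed(1)
x   <- rnorm(500)
y   <- rbinom(500, 1, plogis(-0.5 + x))
fit <- glm(y ~ x, family = binomial)
res_sketch <- hl_binary_sketch(y, fitted(fit))
print(res_sketch)
```

Because the grouping details may differ from those in `logitgof`, the two are not guaranteed to return identical values on the same data.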

Binary version: If `obs` is a vector of 1s and 0s or a factor vector with 2 levels, then the binary version of the test is run. `exp` must be the fitted values obtained from the model, which can be accessed using the `fitted()` function.

Multinomial version: If `obs` is a factor with three or more levels and `ord = FALSE`, the multinomial version of the test is run. If using the `mlogit` package to run a model, ensure `outcome = FALSE` in the `fitted()` function. See examples.

Ordinal version: If `obs` is a factor with three or more levels and `ord = TRUE`, the ordinal version of the test is run. See examples for how to extract fitted values from models constructed using `MASS::polr` or `ordinal::clm`.
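As a rough guide to how the degrees of freedom vary across the three versions, the sketch below assumes the formulas usually given in the cited papers: `g - 2` for the binary test, `(g - 2)(c - 1)` for the multinomial test and `(g - 2)(c - 1) + (c - 2)` for the ordinal test, where `c` is the number of outcome categories. The helper `hl_df` is hypothetical and not part of the package; the three calls match the `df` values printed in the example output for `mod1`, `mod3` and `mod5`.

```r
# Hypothetical helper (not part of generalhoslem): degrees of freedom for
# g groups and n_cat outcome categories, under the formulas assumed above.
hl_df <- function(g, n_cat, ord = FALSE) {
  df <- (g - 2) * (n_cat - 1)        # binary (n_cat = 2) and multinomial
  if (ord) df <- df + (n_cat - 2)    # extra term for the ordinal version
  df
}

hl_df(10, 2)              # binary model, g = 10
hl_df(10, 3)              # multinomial model with 3 categories, g = 10
hl_df(5, 3, ord = TRUE)   # ordinal model with 3 categories, g = 5
```

When fewer than `g` groups can be formed (see the warning for the `mlogit` example), the number of groups actually used replaces `g`.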

Note that Fagerland and Hosmer (2013) point out that the model needs to have at least as many covariate patterns as groups. This is achieved easily where there are continuous predictors or several categorical variables. The test will not be valid where there are only one or two categorical predictor variables. Fagerland and Hosmer (2016) also recommend running the Hosmer-Lemeshow test for ordinal models alongside the Lipsitz test (`lipsitz.test`) and the Pulkstenis-Robinson tests (`pulkrob.chisq` and `pulkrob.deviance`), as each detects different types of lack of fit.
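One quick way to check the covariate-pattern requirement before running the test is to count the distinct combinations of predictor values. The helper below is a hypothetical convenience function, not part of the package, shown here on `mtcars`.

```r
# Hypothetical helper: number of distinct covariate patterns in `data`
# for the predictors named in `vars`.
n_covariate_patterns <- function(data, vars) {
  nrow(unique(data[, vars, drop = FALSE]))
}

g <- 10
# A single categorical predictor gives too few patterns for g = 10 groups
n_covariate_patterns(mtcars, "cyl") >= g
# Adding a continuous predictor gives many more patterns
n_covariate_patterns(mtcars, c("cyl", "mpg")) >= g
```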

Finally, it has been observed that the results from this implementation of the binary and ordinal Hosmer-Lemeshow tests and the Lipsitz test are slightly different from the Stata implementations. It is not yet clear why this is, but it is under investigation.

## Value

A list of class `htest` containing:

| Component | Description |
|---|---|
| `statistic` | the value of the relevant test statistic. |
| `parameter` | the number of degrees of freedom used. |
| `p.value` | the p-value. |
| `method` | a character string indicating whether the binary, multinomial or ordinal version of the test was performed. |
| `data.name` | a character string containing the names of the data passed to `obs` and `exp`. |
| `observed` | a table of observed frequencies with `g` rows. Either an `xtabs` generated table (used in the binary version) or a `cast` generated data frame (multinomial version). |
| `expected` | a table of expected frequencies with `g` rows. Either an `xtabs` generated table or a `cast` generated data frame. |
| `stddiffs` | a table of the standardised differences. See Hosmer, Lemeshow and Sturdivant (2013), p 162. |
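The components listed above can be pulled out of the returned object with `$`. The sketch below assumes the `generalhoslem` package is installed (it is skipped otherwise) and reuses the binary model from the examples; the warning about small expected counts is suppressed only to keep the output short.

```r
if (requireNamespace("generalhoslem", quietly = TRUE)) {
  library(generalhoslem)

  mod <- glm(vs ~ cyl + mpg, data = mtcars, family = binomial)
  # mtcars is small, so logitgof() warns about expected counts below 1
  res <- suppressWarnings(logitgof(mtcars$vs, fitted(mod)))

  print(res$statistic)   # test statistic
  print(res$parameter)   # degrees of freedom
  print(res$p.value)     # p-value
  print(res$observed)    # observed frequencies by group
  print(res$expected)    # expected frequencies by group
  print(res$stddiffs)    # standardised differences
}
```

Inspecting `observed`, `expected` and `stddiffs` side by side shows which groups contribute most to the statistic.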

## Author(s)

Matthew Alexander Jay, with code adapted from `ResourceSelection::hoslem.test` by Peter Solymos.

## References

• Fagerland MW, Hosmer DW, Bofin AM. Multinomial goodness-of-fit tests for logistic regression models. Statistics in Medicine 2008;27(21):4238-53.

• Fagerland MW, Hosmer DW. A goodness-of-fit test for the proportional odds regression model. Statistics in Medicine 2013;32:2235-2249.

• Fagerland MW, Hosmer DW. Tests for goodness of fit in ordinal logistic regression models. Journal of Statistical Computation and Simulation 2016. DOI: 10.1080/00949655.2016.1156682.

• Fagerland MW, Hosmer DW. How to test for goodness of fit in ordinal logistic regression models. The Stata Journal 2017;17(3):668-686.

• Hosmer DW, Lemeshow S, Sturdivant RX. Applied Logistic Regression, 3rd Edition. 2013. New York, USA: John Wiley and Sons.

## See Also

`lipsitz.test`, `pulkrob.chisq`.

## Examples

```
### The mtcars dataset is a terrible example to use due to its small
### size - some of the models will return warnings as a result.

library(generalhoslem)

## Binary model
# 1/0 coding
data(mtcars)
mod1 <- glm(vs ~ cyl + mpg, data = mtcars, family = binomial)
logitgof(mtcars$vs, fitted(mod1))

# factor name coding
mtcars$engine <- factor(ifelse(mtcars$vs == 0, "V", "S"), levels = c("V", "S"))
mod2 <- glm(engine ~ cyl + mpg, data = mtcars, family = binomial)
logitgof(mtcars$engine, fitted(mod2))

## Multinomial model
# with nnet
library(nnet)
mod3 <- multinom(gear ~ mpg + cyl, data = mtcars)
logitgof(mtcars$gear, fitted(mod3))

# with mlogit
library(mlogit)
data("Fishing", package = "mlogit")
Fish <- mlogit.data(Fishing, varying = c(2:9), shape = "wide", choice = "mode")
mod4 <- mlogit(mode ~ 0 | income, data = Fish)
logitgof(Fishing$mode, fitted(mod4, outcome = FALSE))

## Ordinal model
# polr in package MASS
library(MASS)
mod5 <- polr(as.factor(gear) ~ mpg + cyl, data = mtcars)
logitgof(mtcars$gear, fitted(mod5), g = 5, ord = TRUE)

# clm in package ordinal
library(ordinal)
mtcars$gear <- as.factor(mtcars$gear)
mod6 <- clm(gear ~ mpg + cyl, data = mtcars)
predprob <- data.frame(mpg = mtcars$mpg, cyl = mtcars$cyl)
fv <- predict(mod6, newdata = predprob, type = "prob")$fit
logitgof(mtcars$gear, fv, g = 5, ord = TRUE)
```

### Example output

```Loading required package: reshape

Hosmer and Lemeshow test (binary model)

data:  mtcars$vs, fitted(mod1)
X-squared = 8.8633, df = 8, p-value = 0.354

Warning message:
In logitgof(mtcars$vs, fitted(mod1)) :
At least one cell in the expected frequencies table is < 1. Chi-square approximation may be incorrect.

Hosmer and Lemeshow test (binary model)

data:  mtcars$engine, fitted(mod2)
X-squared = 8.8633, df = 8, p-value = 0.354

Warning message:
In logitgof(mtcars$engine, fitted(mod2)) :
At least one cell in the expected frequencies table is < 1. Chi-square approximation may be incorrect.
# weights:  12 (6 variable)
initial  value 35.155593
iter  10 value 21.978836
iter  20 value 21.773447
iter  30 value 21.761921
final  value 21.761297
converged

Hosmer and Lemeshow test (multinomial model)

data:  mtcars$gear, fitted(mod3)
X-squared = 13.501, df = 16, p-value = 0.6358

Warning message:
In logitgof(mtcars$gear, fitted(mod3)) :
At least one cell in the expected frequencies table is < 1. Chi-square approximation may be incorrect.

Please cite the 'maxLik' package as:
Henningsen, Arne and Toomet, Ott (2011). maxLik: A package for maximum likelihood estimation in R. Computational Statistics 26(3), 443-458. DOI 10.1007/s00180-010-0217-1.

If you have questions, suggestions, or comments regarding the 'maxLik' package, please use a forum or 'tracker' at maxLik's R-Forge site:
https://r-forge.r-project.org/projects/maxlik/

Hosmer and Lemeshow test (multinomial model)

data:  Fishing$mode, fitted(mod4, outcome = FALSE)
X-squared = 28.781, df = 18, p-value = 0.05113

Warning message:
In logitgof(Fishing$mode, fitted(mod4, outcome = FALSE)) :
Not possible to compute 10 rows. There might be too few observations.

Hosmer and Lemeshow test (ordinal model)

data:  mtcars$gear, fitted(mod5)
X-squared = 16.273, df = 7, p-value = 0.02274

Warning message:
In logitgof(mtcars$gear, fitted(mod5), g = 5, ord = TRUE) :
At least one cell in the expected frequencies table is < 1. Chi-square approximation may be incorrect.

Hosmer and Lemeshow test (ordinal model)

data:  mtcars$gear, fv
X-squared = 16.273, df = 7, p-value = 0.02274

Warning message:
In logitgof(mtcars$gear, fv, g = 5, ord = TRUE) :
At least one cell in the expected frequencies table is < 1. Chi-square approximation may be incorrect.
```

generalhoslem documentation built on June 3, 2019, 5:03 p.m.