RsqGLM | R Documentation |
This function calculates some (pseudo) R-squared statistics for binomial Generalized Linear Models.
RsqGLM(model = NULL, obs = NULL, pred = NULL, use = "pairwise.complete.obs",
plot = TRUE, plot.type = "lollipop", na.rm = TRUE, rm.dup = FALSE,
verbosity = 2, ...)
model |
a binary-response model object of class "glm". Alternatively, you can input the 'obs' and 'pred' arguments instead of 'model'. |
obs |
alternatively to 'model' and together with 'pred', a vector of observed presences (1) and absences (0) of a binary response variable. Alternatively (and if 'pred' is a 'SpatRaster'), a two-column matrix or data frame containing, respectively, the x (longitude) and y (latitude) coordinates of the presence points, in which case the 'obs' vector will be extracted with |
pred |
alternatively to 'model' and together with 'obs', a numeric vector with the corresponding predicted values of presence probability. Must be of the same length and in the same order as 'obs'. Alternatively (and if 'obs' is a set of point coordinates), a 'SpatRaster' map of the predicted values for the entire evaluation region, in which case the 'pred' vector will be extracted with |
use |
argument to be passed to |
plot |
logical value indicating whether or not to display a bar chart or (by default) a lollipop chart of the calculated measures. |
plot.type |
character value indicating the type of plot to produce (if plot=TRUE). Can be " |
na.rm |
logical value indicating whether missing values should be ignored in computations. Defaults to TRUE. |
rm.dup |
If |
verbosity |
integer specifying how many messages to display. Defaults to the maximum implemented; lower numbers (down to 0) display fewer messages. |
... |
additional arguments to pass to the plotting function (see Examples). |
Implemented measures include the R-squared metrics of McFadden (1974), Cox and Snell (1989), Nagelkerke (1991, which corresponds to the corrected Cox-Snell, rescaled so that its maximum attainable value is 1), and Tjur (2009). See Allison (2014) for a brief review of these measures.
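The measures listed above can be sketched directly from a vector of 0/1 observations and a vector of predicted probabilities. This is a minimal illustration using the standard textbook formulas; 'pseudo_rsq' is a hypothetical helper written for this sketch and is not the code used inside RsqGLM. It assumes the predictions come from a maximum-likelihood fit that includes an intercept, so that the null model predicts the observed prevalence.

```r
# Hypothetical helper (not part of modEvA): pseudo R-squared values computed
# from 0/1 observations and predicted probabilities of presence.
pseudo_rsq <- function(obs, pred) {
  n <- length(obs)
  # Bernoulli log-likelihood of the fitted probabilities
  ll_model <- sum(dbinom(obs, size = 1, prob = pred, log = TRUE))
  # log-likelihood of an intercept-only model, which predicts the prevalence
  ll_null <- sum(dbinom(obs, size = 1, prob = mean(obs), log = TRUE))

  mcfadden <- 1 - ll_model / ll_null
  cox_snell <- 1 - exp(2 * (ll_null - ll_model) / n)
  # Nagelkerke: Cox-Snell divided by its maximum attainable value
  nagelkerke <- cox_snell / (1 - exp(2 * ll_null / n))
  # Tjur: mean prediction at presences minus mean prediction at absences
  tjur <- mean(pred[obs == 1]) - mean(pred[obs == 0])

  list(McFadden = mcfadden, CoxSnell = cox_snell,
       Nagelkerke = nagelkerke, Tjur = tjur)
}

# simulated data (an assumption for illustration only)
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.2 * x))
fit <- glm(y ~ x, family = binomial)

rsq <- pseudo_rsq(obs = fit$y, pred = fit$fitted.values)
print(rsq)
```

For a binary (0/1) response, the McFadden value obtained this way should match 1 - deviance / null.deviance of the fitted 'glm' object, which offers a quick consistency check.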
Note that pseudo R-squared values tend to be considerably lower than those of the R-squared for ordinary regression analysis, and they should not be judged by the same standards for a "good fit". For example, for McFadden's R-squared, values of 0.2 to 0.4 represent an excellent fit (McFadden, 1979).
The function returns a named list of the calculated R-squared values.
Tjur's R-squared can only be calculated for models with a binomial response variable; otherwise, NA will be returned.
A. Marcia Barbosa
Allison, P. (2014) Measures of fit for logistic regression. SAS Global Forum, Paper 1485-2014.
Cox, D.R. & Snell, E.J. (1989) The Analysis of Binary Data, 2nd ed. Chapman and Hall, London.
McFadden, D. (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka, P. (ed.) Frontiers in Econometrics. Academic Press, New York.
McFadden, D. (1979) Quantitative Methods for Analyzing Travel Behaviour of Individuals: Some Recent Developments. Chapter 15 in Behavioural Travel Modelling. Edited by David Hensher and Peter Stopher.
Nagelkerke, N.J.D. (1991) A note on a general definition of the coefficient of determination. Biometrika, 78: 691-692.
Tjur, T. (2009) Coefficients of determination in logistic regression models - a new proposal: the coefficient of discrimination. The American Statistician, 63: 366-372.
Dsquared, AUC, threshMeasures, HLfit
# load sample models:
data(rotif.mods)
# choose a particular model to play with:
mod <- rotif.mods$models[[1]]
RsqGLM(model = mod)
# you can also use RsqGLM with vectors of observed and predicted values
# instead of a model object:
RsqGLM(obs = mod$y, pred = mod$fitted.values)
# plotting arguments can be modified:
par(mar = c(6, 3, 2, 1))
RsqGLM(obs = mod$y, pred = mod$fitted.values, col = "seagreen", border = NA,
ylim = c(0, 1), main = "Pseudo-R-squared values")