summary_stats: Calculate summary statistics for list of models
In jasongraf1/VADIS: Variation-Based Distance & Similarity Modeling

summary_stats

R Documentation

Calculate summary statistics for list of models

Usage

summary_stats(model_list, data_list = NULL, response = NULL)

Arguments

`model_list`	a list of model objects
`data_list`	a list of data frames
`response`	the name of the response column in the data

Value

a data frame with one or more of the following columns:

N: The number of observations in the dataset
baseline: The baseline accuracy of the dataset
predicted.corr: The proportion of observations correctly predicted by the model
Brier: The Brier score of model accuracy. Only available for models that return predicted probabilities.
C: The concordance index (see Harrell 2015: 256-258)
LogScore: The cross-entropy loss, or log loss, score, which measures the performance of a classification model whose output is a probability value between 0 and 1. Only available for models that return predicted probabilities.
AIC: The Akaike Information Criterion. Only given for regression models fit with glm and glmer.
WAIC: The Widely Applicable Information Criterion, or Watanabe–Akaike Information Criterion. Only given for models of class brmsfit.
Max.VIF: The maximal variance inflation factor obtained from the covariance matrix of parameter estimates in the model using the method of Davis et al. (1986). An indication of multicollinearity. Only given for regression models fit with glm and glmer.
kappa: The condition number calculated from the model matrix (with the intercept included), following Belsley et al. (1980). Only given for regression models fit with glm and glmer.
HosLem.p: The p-value from the Hosmer-Lemeshow goodness of fit test for logistic regression. Values below .05 indicate evidence of poor model fit. Only given for regression models fit with glm and glmer.
elpd_loo: The Bayesian leave-one-out (LOO) estimate of the expected log pointwise predictive density (ELPD). See loo and Vehtari et al. (2017) and https://avehtari.github.io/modelselection/CV-FAQ.html for details. Only given for models of class brmsfit.
p_loo: The effective number of parameters. See loo. Only given for models of class brmsfit.
looic: The LOO information criterion, which is calculated as -2 * elpd_loo. See loo. Only given for models of class brmsfit.

Examples

## Not run: 
# Simulate four data sets with a binary response and correlated predictors
data_list <- vector("list")
for (i in 1:4) {
  df <- data.frame(x1 = rnorm(100))
  df$x2 <- df$x1 + .1 * rnorm(100)
  df$x3 <- df$x2 + .5 * rnorm(100)
  df$y <- rbinom(100, 1, 1 / (1 + exp(-df$x1 + df$x2 + df$x3)))
  data_list[[i]] <- df
}

# Fit one logistic regression per data set, keeping the model matrix (x = TRUE)
rm_list <- lapply(data_list, FUN = function(d) glm(y ~ ., data = d, family = binomial, x = TRUE))

summary_stats(rm_list)

## End(Not run)

References

Belsley, D. A., Kuh, E. & Welsch, R. E. 1980. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley Series in Probability and Mathematical Statistics. New York: Wiley.

Davis, C. E., Hyde, J. E., Bangdiwala, S. I. & Nelson, J. J. 1986. An example of dependencies among variables in a conditional logistic regression. Modern Statistical Methods in Chronic Disease Epidemiology. 140–147.

Harrell, Frank E. 2015. Regression Modeling Strategies. 2nd edn. New York: Springer.

Vehtari, Aki, Andrew Gelman & Jonah Gabry. 2017. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing 27(5). 1413–1432.
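
As a rough illustration of the accuracy-based columns listed under Value (N, baseline, predicted.corr, Brier, C, LogScore), the sketch below computes them by hand for a single glm using base R only. It is not the VADIS implementation. The 0.5 classification threshold, the majority-class definition of the baseline, the averaging (rather than summing) of the log loss, and the clipping constant are all assumptions made for the example.

```r
# Simulate one binary-response data set (same recipe as the documented example)
set.seed(1)
df <- data.frame(x1 = rnorm(100))
df$x2 <- df$x1 + 0.1 * rnorm(100)
df$x3 <- df$x2 + 0.5 * rnorm(100)
df$y <- rbinom(100, 1, 1 / (1 + exp(-df$x1 + df$x2 + df$x3)))

fit <- glm(y ~ ., data = df, family = binomial)

y <- df$y
p <- unname(fitted(fit))  # predicted probabilities that y == 1

# Guard against log(0); the clipping constant is an assumption of this sketch
p_clip <- pmin(pmax(p, 1e-15), 1 - 1e-15)

N <- length(y)

# Assumed baseline: accuracy of always predicting the more frequent class
baseline <- max(mean(y), 1 - mean(y))

# Assumed threshold of 0.5 for turning probabilities into class predictions
predicted.corr <- mean((p > 0.5) == (y == 1))

# Brier score: mean squared difference between probability and outcome
Brier <- mean((p - y)^2)

# Log loss (cross-entropy), averaged over observations
LogScore <- -mean(y * log(p_clip) + (1 - y) * log(1 - p_clip))

# Concordance index via the rank-sum (Mann-Whitney) formula
n1 <- sum(y == 1)
n0 <- sum(y == 0)
C <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

stats <- data.frame(N, baseline, predicted.corr, Brier, C, LogScore)
print(round(stats, 3))
```

For a binomial glm with a 0/1 response, the averaged log loss computed this way should match -logLik(fit) / N, which gives a quick consistency check on the sketch.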