| compfit | R Documentation |
Fits multiple regression models and provides a comprehensive comparison table with model quality metrics, convergence diagnostics, and selection guidance. Computes a composite score combining multiple quality metrics to facilitate rapid model comparison and selection.
compfit(
  data,
  outcome,
  model_list,
  model_names = NULL,
  interactions_list = NULL,
  random = NULL,
  model_type = "auto",
  family = "binomial",
  conf_level = 0.95,
  p_digits = 3,
  include_coefficients = FALSE,
  scoring_weights = NULL,
  labels = NULL,
  number_format = NULL,
  verbose = NULL,
  ...
)
data |
Data frame or data.table containing the dataset. |
outcome |
Character string specifying the outcome variable. For survival
analysis, use Surv() syntax, e.g. "Surv(os_months, os_status)". |
model_list |
List of character vectors, each containing predictor names for one model. Can also be a single character vector to auto-generate nested models. |
model_names |
Character vector of names for each model. If |
interactions_list |
List of character vectors specifying interaction
terms for each model. Each element corresponds to one model in model_list.
Use NULL for models without interactions. |
random |
Character string specifying the random-effects formula for
mixed-effects models ( |
model_type |
Character string specifying model type. If
|
family |
For GLM and GLMER models, specifies the error distribution and link function. Common options include:
For negative binomial, use |
conf_level |
Numeric confidence level for intervals. Default is 0.95. |
p_digits |
Integer specifying the number of decimal places for
p-values. Values smaller than |
include_coefficients |
Logical. If TRUE, includes a second table with coefficient estimates. Default is FALSE. |
scoring_weights |
Named list of scoring weights. Each weight should be
between 0 and 1, and they should sum to 1. Available metrics depend on model
type. If |
labels |
Named character vector providing custom display labels for
variables. Default is NULL. |
number_format |
Character string or two-element character vector controlling thousand and decimal separators in formatted output. Named presets:
Or provide a custom two-element vector When
options(summata.number_format = "eu")
|
verbose |
Logical. If |
... |
Additional arguments passed to model fitting functions. |
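For the single-character-vector form of model_list, one plausible reading of "nested models" is a sequence in which each model adds the next predictor. The sketch below builds such a list by hand in base R. The cumulative ordering and the helper name make_nested() are assumptions made for illustration, not a description of what compfit() does internally.

```r
# Hypothetical helper: build cumulative (nested) predictor sets from one vector.
# The ordering rule is an assumption; compfit() may nest predictors differently.
make_nested <- function(predictors) {
  nested <- lapply(seq_along(predictors), function(i) predictors[seq_len(i)])
  names(nested) <- paste0("model_", seq_along(predictors))
  nested
}

nested_models <- make_nested(c("age", "sex", "smoking"))
str(nested_models)
```

A list built this way has the same shape as the explicit model_list used in the examples, so it can be inspected or edited before being passed to compfit().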
This function fits all specified models and computes comprehensive quality metrics for comparison. It generates a Composite Model Score (CMS) that combines multiple metrics: lower AIC/BIC (information criteria), higher concordance (discrimination), and model convergence status.
For GLMs, McFadden's pseudo-R-squared is calculated as 1 - (logLik/logLik_null). For survival models, the global p-value comes from the log-rank test.
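The pseudo-R-squared formula above can be checked by hand in base R. This is a minimal sketch on the built-in mtcars data, chosen only for illustration. The object names are hypothetical, and the code does not call compfit().

```r
# Logistic model and intercept-only (null) model on mtcars, for illustration.
full_fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
null_fit <- glm(am ~ 1, data = mtcars, family = binomial)

# McFadden's pseudo-R-squared: 1 - logLik(model) / logLik(null model)
mcfadden_r2 <- 1 - as.numeric(logLik(full_fit)) / as.numeric(logLik(null_fit))
mcfadden_r2
```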
Models that fail to converge are flagged and penalized in the composite score.
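To make the idea of a weighted composite concrete, the sketch below rescales each metric to the range 0 to 1, orients it so that higher is better, and takes a weighted sum. The exact formula used for the CMS is not given here, so everything in this block (the rescaling rule, the weights, the metric values, and the names rescale01 and metrics) is an assumption for illustration only.

```r
# Illustrative metrics for three hypothetical models (values are made up).
metrics <- data.frame(
  model       = c("Base", "Clinical", "Full"),
  aic         = c(520, 505, 498),
  concordance = c(0.62, 0.70, 0.74),
  converged   = c(TRUE, TRUE, FALSE)
)

# Rescale to [0, 1]; use higher_is_better = FALSE for metrics such as AIC.
rescale01 <- function(x, higher_is_better = TRUE) {
  rng <- range(x)
  if (diff(rng) == 0) return(rep(1, length(x)))
  scaled <- (x - rng[1]) / diff(rng)
  if (higher_is_better) scaled else 1 - scaled
}

# Assumed weights (hypothetical, not the package defaults); they sum to 1.
w <- c(convergence = 0.30, aic = 0.35, concordance = 0.35)

# Weighted sum on a 0-100 scale; a non-converged model loses the
# convergence component, which acts as the penalty.
metrics$score <- 100 * (
  w[["convergence"]] * as.numeric(metrics$converged) +
  w[["aic"]]         * rescale01(metrics$aic, higher_is_better = FALSE) +
  w[["concordance"]] * rescale01(metrics$concordance)
)
metrics[order(-metrics$score), c("model", "score")]
```

With these made-up numbers the non-converged "Full" model has the best AIC and concordance but still ranks below "Clinical", which is the kind of penalty the text describes.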
Interaction Terms:
When interactions_list is provided, each element specifies the
interaction terms for the corresponding model in model_list. This is
particularly useful for testing whether adding interactions improves model fit:
Use NULL for models without interactions
Specify interactions using colon notation: c("age:treatment", "sex:stage")
Main effects for all variables in interactions must be in the predictor list
Common pattern: Compare main effects model vs model with interactions
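The rules above can be illustrated in base R by checking that every variable in an interaction also appears as a main effect, then assembling the corresponding model formula with reformulate(). How compfit() builds its formulas internally is not documented here, so treat this as a sketch of the equivalent formula rather than the package's own code.

```r
predictors   <- c("age", "treatment", "sex")
interactions <- c("treatment:sex")

# Check that every variable used in an interaction is also a main effect.
interaction_vars <- unique(unlist(strsplit(interactions, ":", fixed = TRUE)))
stopifnot(all(interaction_vars %in% predictors))

# Build the model formula from main effects plus interaction terms.
model_formula <- reformulate(c(predictors, interactions), response = "os_status")
model_formula
```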
Scoring weights can be customized based on model type:
GLM: "convergence", "aic", "concordance", "pseudo_r2", "brier"
Cox: "convergence", "aic", "concordance", "global_p"
Linear: "convergence", "aic", "pseudo_r2", "rmse"
Default weights emphasize discrimination (concordance) and model fit (AIC).
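A custom weight set for a GLM comparison might look like the sketch below. The metric names are the ones listed above. The numeric values are illustrative assumptions, not the package defaults. The checks mirror the documented constraints (each weight between 0 and 1, weights summing to 1).

```r
# Custom weights for a GLM comparison (values are illustrative only).
glm_weights <- list(
  convergence = 0.10,
  aic         = 0.30,
  concordance = 0.30,
  pseudo_r2   = 0.15,
  brier       = 0.15
)

# Sanity checks matching the documented constraints.
w <- unlist(glm_weights)
stopifnot(all(w >= 0 & w <= 1))
stopifnot(isTRUE(all.equal(sum(w), 1)))
```

The resulting list would then be supplied as scoring_weights = glm_weights in the call to compfit().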
The composite score is designed as a tool to quickly rank models by their quality metrics. It should be used alongside traditional model selection criteria rather than as a definitive model selection method.
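As an example of the traditional criteria that the composite score is meant to sit alongside, the following base R sketch compares two nested logistic models by AIC, BIC, and a likelihood ratio test. It uses the built-in mtcars data purely for illustration, and the object names are hypothetical.

```r
# Nested logistic models on mtcars, for illustration only.
m1 <- glm(am ~ wt, data = mtcars, family = binomial)
m2 <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Information criteria (lower is better).
ic <- data.frame(
  model = c("m1", "m2"),
  AIC   = c(AIC(m1), AIC(m2)),
  BIC   = c(BIC(m1), BIC(m2))
)
ic

# Likelihood ratio test for the added predictor.
lrt <- anova(m1, m2, test = "Chisq")
lrt
```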
A data.table with class "compfit_result" containing:
Model name/identifier
Composite Model Score for model selection (higher is better)
Sample size
Number of events (for survival/logistic)
Number of predictors
Whether model converged properly
Akaike Information Criterion
Bayesian Information Criterion
R^2 / Pseudo-R^2: McFadden pseudo-R-squared (GLM)
C-statistic (logistic/survival)
Brier accuracy score (logistic)
Overall model p-value
Attributes include:
List of fitted model objects
Coefficient comparison table (if requested)
Name of recommended model
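The C-statistic and Brier score reported in the result table can be reproduced by hand for any fitted logistic model, for example one retrieved from the list of fitted model objects. The sketch below uses a base R glm on mtcars as a stand-in. Whether compfit() computes these quantities in exactly this way is an assumption.

```r
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
p <- fitted(fit)   # predicted probabilities
y <- mtcars$am     # observed 0/1 outcome

# Brier score: mean squared difference between prediction and outcome
# (lower is better).
brier <- mean((p - y)^2)

# C-statistic via the rank (Mann-Whitney) formulation: the probability that
# a randomly chosen event has a higher prediction than a non-event.
n1 <- sum(y == 1)
n0 <- sum(y == 0)
c_stat <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

c(brier = brier, c_statistic = c_stat)
```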
fit for individual model fitting,
fullfit for automated variable selection,
table2pdf for exporting results
Other regression functions:
fit(),
fullfit(),
multifit(),
print.compfit_result(),
print.fit_result(),
print.fullfit_result(),
print.multifit_result(),
print.uniscreen_result(),
uniscreen()
# Load example data
data(clintrial)
data(clintrial_labels)
# Example 1: Compare nested logistic regression models
models <- list(
  base     = c("age", "sex"),
  clinical = c("age", "sex", "smoking", "diabetes"),
  full     = c("age", "sex", "smoking", "diabetes", "stage", "ecog")
)
comparison <- compfit(
  data = clintrial,
  outcome = "os_status",
  model_list = models,
  model_names = c("Base", "Clinical", "Full")
)
comparison
# Example 2: Compare Cox survival models
library(survival)
surv_models <- list(
  simple   = c("age", "sex"),
  clinical = c("age", "sex", "stage", "grade")
)
surv_comparison <- compfit(
  data = clintrial,
  outcome = "Surv(os_months, os_status)",
  model_list = surv_models,
  model_type = "coxph"
)
surv_comparison
# Example 3: Test effect of adding interaction terms
interaction_models <- list(
  main     = c("age", "treatment", "sex"),
  interact = c("age", "treatment", "sex")
)
interaction_comp <- compfit(
  data = clintrial,
  outcome = "os_status",
  model_list = interaction_models,
  model_names = c("Main Effects", "With Interaction"),
  interactions_list = list(
    NULL,               # first model: main effects only
    c("treatment:sex")  # second model: adds the treatment-by-sex interaction
  )
)
interaction_comp
# Example 4: Include coefficient comparison table
detailed <- compfit(
  data = clintrial,
  outcome = "os_status",
  model_list = models,
  include_coefficients = TRUE,
  labels = clintrial_labels
)
# Access coefficient table
coef_table <- attr(detailed, "coefficients")
coef_table
# Example 5: Access fitted model objects
fitted_models <- attr(comparison, "models")
names(fitted_models)
# Example 6: Get best model recommendation
best <- attr(comparison, "best_model")
cat("Recommended model:", best, "\n")