knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE, dpi = 150 ) # Use ragg for better font rendering if available if (requireNamespace("ragg", quietly = TRUE)) { knitr::opts_chunk$set(dev = "ragg_png") } old_opts <- options(width = 180)
Model selection is a fundamental problem in applied statistics. Given a set of candidate models, it is important to determine which specification best balances goodness-of-fit against parsimony. Traditional approaches rely on information criteria (AIC, BIC), discrimination metrics (C-statistic), and hypothesis tests for nested models.
The compfit() function synthesizes these metrics into a weighted Composite Model Score (CMS) to facilitate systematic comparison between models. Like other functions in this package, it follows the standard summata input paradigm, with an additional parameter to allow for multiple models:
compfit(data, outcome, model_list, model_type, ...)
where data is the dataset, outcome is the common endpoint variable, model_list is a named list of predictor vectors (each representing a different model to be compared), and model_type is the type of model to be implemented. This vignette demonstrates the various capabilities of this function using the included sample dataset.
The examples in this vignette use the clintrial dataset included with summata:
library(summata) data(clintrial) data(clintrial_labels)
The compfit() function evaluates models using several quality metrics, then combines them into a single Composite Model Score (CMS) ranging from 0 to 100 for rapid comparison. The metrics available depend on the model type.
| Metric | Used by | Interpretation | Better | |:-------|:--------|:---------------|:-------| | AIC | All | Information criterion balancing fit and complexity | Lower | | BIC | Fallback | Information criterion with stronger complexity penalty | Lower | | Concordance | GLM, Cox, GLMER, Cox ME | Discrimination ability (0.5 = random, 1.0 = perfect) | Higher | | Pseudo-R² | GLM, Cox ME | Proportion of variation explained (McFadden's) | Higher | | R² | LM | Proportion of variance explained | Higher | | Brier Score | GLM | Prediction accuracy for binary outcomes (0 = perfect) | Lower | | Global p | Cox | Omnibus likelihood ratio test p-value | Lower | | RMSE | LM | Root mean squared error of residuals | Lower | | Marginal R² | LMER, GLMER | Variance explained by fixed effects alone | Higher | | Conditional R² | LMER | Variance explained by fixed + random effects | Higher | | ICC | LMER, GLMER, Cox ME | Intraclass correlation; proportion of variance from clustering | Moderate | | Convergence | All | Whether the model converged successfully | Yes |
| Range | Interpretation | |:------|:---------------| | 85–100 | Excellent | | 75–84 | Very Good | | 65–74 | Good | | 55–64 | Fair | | < 55 | Poor |
The score components and their default weights vary by model type.
| Component | LM | GLM | Cox | LMER | GLMER | Cox ME | |:----------|:--:|:---:|:---:|:----:|:-----:|:------:| | Convergence | 15% | 15% | 15% | 20% | 15% | 15% | | AIC | 25% | 25% | 30% | 25% | 25% | 30% | | R² | 45% | | | | | | | Pseudo-R² | | 15% | | | | 10% | | Concordance | | 40% | 40% | | 30% | 35% | | Brier Score | | 5% | | | | | | Global p | | | 15% | | | | | RMSE | 15% | | | | | | | Marginal R² | | | | 25% | 15% | | | Conditional R² | | | | 15% | | | | ICC | | | | 15% | 15% | 10% |
Compare models with increasing complexity:
example1 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Demographics" = c("age", "sex"), "Plus Stage" = c("age", "sex", "stage", "ecog"), "Full Model" = c("age", "sex", "stage", "ecog", "treatment", "smoking") ), model_type = "glm" ) example1
For continuous outcomes, use model_type = "lm":
example2 <- compfit( data = clintrial, outcome = "los_days", model_list = list( "Simple" = c("age", "sex"), "Disease" = c("age", "sex", "stage", "ecog"), "Treatment" = c("age", "sex", "stage", "ecog", "surgery", "treatment") ), model_type = "lm" ) example2
For time-to-event outcomes, use model_type = "coxph":
example3 <- compfit( data = clintrial, outcome = "Surv(os_months, os_status)", model_list = list( "Unadjusted" = c("treatment"), "Demographics" = c("treatment", "age", "sex"), "Full" = c("treatment", "age", "sex", "stage", "ecog") ), model_type = "coxph" ) example3
For count outcomes, use model_type = "glm" with a count model family (e.g., family = "poisson"):
example4 <- compfit( data = clintrial, outcome = "fu_count", model_list = list( "Minimal" = c("age", "treatment"), "Clinical" = c("age", "treatment", "stage", "ecog"), "Full" = c("age", "treatment", "stage", "ecog", "surgery", "diabetes") ), model_type = "glm", family = "poisson", labels = clintrial_labels ) example4
A key application of compfit() is testing whether interaction terms improve model fit.
Compare a main-effects model to one with an interaction term:
example5 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Main Effects" = c("age", "treatment", "sex"), "With Interaction" = c("age", "treatment", "sex") ), interactions_list = list( NULL, c("sex:treatment") ), model_type = "glm" ) example5
Compare different interaction hypotheses:
example6 <- compfit( data = clintrial, outcome = "Surv(os_months, os_status)", model_list = list( "Main Effects" = c("age", "treatment", "sex", "stage"), "Age × Treatment" = c("age", "treatment", "sex", "stage"), "Sex × Treatment" = c("age", "treatment", "sex", "stage"), "Both" = c("age", "treatment", "sex", "stage") ), interactions_list = list( NULL, c("age:treatment"), c("sex:treatment"), c("age:treatment", "sex:treatment") ), model_type = "coxph" ) example6
Setting include_coefficients = TRUE generates a table comparing coefficients across models:
example7 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Model A" = c("age", "sex"), "Model B" = c("age", "sex", "stage"), "Model C" = c("age", "sex", "stage", "treatment") ), model_type = "glm", include_coefficients = TRUE, labels = clintrial_labels ) # Main comparison example7 # Coefficient table coef_table <- attr(example7, "coefficients") coef_table
The underlying model objects are stored as attributes for further analysis:
models <- attr(example7, "models") names(models) # Examine a specific model summary(models[["Model C"]])
The best-performing model is identified automatically based on the Composite Model Score (CMS):
example9 <- compfit( data = clintrial, outcome = "Surv(os_months, os_status)", model_list = list( "Minimal" = c("treatment"), "Standard" = c("treatment", "age", "sex", "stage"), "Extended" = c("treatment", "age", "sex", "stage", "ecog", "grade") ), model_type = "coxph" ) recommended <- attr(example9, "best_model") cat("Recommended model:", recommended, "\n")
Modify the CMS scoring weights to emphasize specific metrics:
example10 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Simple" = c("age", "sex"), "Standard" = c("age", "sex", "stage"), "Full" = c("age", "sex", "stage", "treatment", "ecog") ), model_type = "glm", scoring_weights = list( convergence = 0.05, aic = 0.20, concordance = 0.60, pseudo_r2 = 0.10, brier = 0.05 ) ) example10
Evaluate the impact of covariate adjustment on effect estimates:
scenario1 <- compfit( data = clintrial, outcome = "Surv(os_months, os_status)", model_list = list( "Crude" = c("treatment"), "Age-Sex Adjusted" = c("treatment", "age", "sex"), "Fully Adjusted" = c("treatment", "age", "sex", "stage", "ecog") ), model_type = "coxph", include_coefficients = TRUE, labels = clintrial_labels ) scenario1 # Compare treatment effect across models attr(scenario1, "coefficients")
Compare automated versus theory-driven selection:
# Identify candidates via screening screening <- uniscreen( data = clintrial, outcome = "surgery", predictors = c("age", "sex", "bmi", "smoking", "diabetes", "hypertension", "stage", "ecog", "treatment"), model_type = "glm", p_threshold = 0.10 ) # Extract significant predictors sig_vars <- attr(screening, "significant") scenario2 <- compfit( data = clintrial, outcome = "surgery", model_list = list( "Theory-Driven" = c("age", "sex", "stage", "treatment"), "Data-Driven" = sig_vars, "Combined" = unique(c("age", "sex", "stage", "treatment", sig_vars)) ), model_type = "glm" ) scenario2
Test whether additional predictors meaningfully improve fit:
scenario3 <- compfit( data = clintrial, outcome = "los_days", model_list = list( "3 Predictors" = c("age", "surgery", "ecog"), "5 Predictors" = c("age", "surgery", "ecog", "stage", "treatment"), "8 Predictors" = c("age", "surgery", "ecog", "stage", "treatment", "sex", "smoking", "diabetes") ), model_type = "lm", labels = clintrial_labels ) scenario3
Comparison tables can be exported to various formats:
# Main comparison table table2docx( table = example1, file = file.path(tempdir(), "Model_Comparison.docx"), caption = "Table 3. Model Comparison Results" ) # Coefficient table table2docx( table = attr(example6, "coefficients"), file = file.path(tempdir(), "Coefficient_Comparison.docx"), caption = "Table S1. Coefficient Estimates Across Models" ) # PDF with landscape orientation for wide tables table2pdf( table = example1, file = file.path(tempdir(), "Model_Comparison.pdf"), caption = "Model Comparison", orientation = "landscape" )
Check convergence status and consider simplifying non-converging models:
# Check convergence status comparison[, .(Model, Converged)] # For non-converging models: # 1. Reduce complexity # 2. Check for separation (logistic) # 3. Examine predictor correlations
When encountering non-converging models, consider reducing complexity, examining predictor correlations, and checking for perfect separation (primarily in logistic models).
Models with missing data may have different sample sizes, which complicates comparison:
# Check sample sizes comparison[, .(Model, N, Events)] # Use complete cases for fair comparison complete_data <- na.omit(data[, relevant_vars, with = FALSE])
When scores are similar, prefer parsimony and consider domain knowledge:
# Examine individual metrics comparison[, .(Model, `Composite Model Score (CMS)`, AIC, Concordance)] # Prefer parsimony when scores are close # Consider interpretability
options(old_opts)
desctable() for baseline characteristicsfit(), uniscreen(), and fullfit()multifit() for multi-outcome analysisAny scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.