textPredictTest: Significance testing for model prediction performance

View source: R/2_4_1_textPredictTextTrained.R

textPredictTest R Documentation

Significance testing for model prediction performance

Description

Compares predictive performance between two models, using either a paired t-test on errors or bootstrapped comparisons of correlation or AUC.

Usage

textPredictTest(
  y1,
  y2,
  yhat1,
  yhat2,
  method = "t-test",
  statistic = "correlation",
  paired = TRUE,
  event_level = "first",
  bootstraps_times = 10000,
  seed = 20250622,
  ...
)

Arguments

y1

The observed scores (i.e., the outcome values the model was trained to predict).

y2

The second set of observed scores (default = NULL). Used when comparing models that predict different outcomes; in this case a bootstrap procedure creates two distributions of correlations that are compared (see Details).

yhat1

The predicted scores from model 1.

yhat2

The predicted scores from model 2 that will be compared with model 1.

method

Character string specifying the comparison approach.

- "t-test": Use when comparing prediction errors from two models predicting the same outcome (only supported when statistic = "correlation"). Performs a paired t-test on absolute errors.

- "bootstrap_difference": Use to compare AUCs from two models predicting either the same outcome (e.g., does Model A outperform Model B on the same classification task) or different outcomes (e.g., mental vs physical health). Bootstraps the difference in AUC across resamples to compute confidence intervals and p-values.

- "bootstrap_overlap": Use when comparing predictions for different outcomes by generating bootstrap distributions of correlation or AUC values (depending on statistic), and testing overlap in distributions. (requires the overlapping package).

Choose the method that aligns with your research question: error comparison on the same outcome, AUC difference testing on the same or different outcomes, or overlap of performance distributions across different outcomes.
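As a minimal sketch (using random illustration data, and assuming the text package is installed), the three methods could be invoked as follows, with argument names taken from the Usage section above:

```r
library(text)  # assumption: the text package is installed and loaded

set.seed(42)
y1    <- runif(20)  # observed outcome 1
yhat1 <- runif(20)  # predictions from model 1
y2    <- runif(20)  # observed outcome 2
yhat2 <- runif(20)  # predictions from model 2

# Same outcome: paired t-test on absolute prediction errors
textPredictTest(y1 = y1, yhat1 = yhat1, yhat2 = yhat2,
                method = "t-test")

# Same or different outcomes: bootstrapped difference in correlation or AUC
textPredictTest(y1 = y1, y2 = y2, yhat1 = yhat1, yhat2 = yhat2,
                method = "bootstrap_difference", statistic = "correlation")

# Different outcomes: overlap of bootstrap distributions
# (requires the overlapping package)
textPredictTest(y1 = y1, y2 = y2, yhat1 = yhat1, yhat2 = yhat2,
                method = "bootstrap_overlap", statistic = "correlation")
```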

statistic

Character string ("correlation" or "auc") specifying the statistic to compare in the bootstrap procedures.

paired

Logical; whether to run a paired test in stats::t.test (default = TRUE).

event_level

Character, "first" or "second", specifying which factor level is treated as the event when computing the AUC in the bootstrap.

bootstraps_times

Number of bootstrap resamples (used when y2 is provided; default = 10000).

seed

Integer seed for reproducibility (default = 20250622).

...

Additional arguments passed to stats::t.test or overlapping::overlap (e.g., plot = TRUE).

Details

- If 'method = "t-test"' is chosen, the function compares the absolute prediction errors (|yhat - y|) from two models predicting the **same** outcome using a paired t-test. Only 'y1', 'yhat1', and 'yhat2' are required.

- If 'method = "bootstrap_difference"' is chosen, the function compares differences in correlation or AUC between **two outcomes** (or the same outcome if y1 = y2), using bootstrapped resampling. Both 'y1' and 'y2' must be provided (and have the same length).

- If 'method = "bootstrap_overlap"' is chosen, the function generates bootstrap distributions of correlation or AUC values for each outcome and tests the overlap of these distributions, assessing similarity in predictive performance.

Choose the method that aligns with your research question:

- Error comparison on the same outcome ("t-test")

- Bootstrapped difference testing on the same or different outcomes ("bootstrap_difference")

- Overlap of performance distributions across different outcomes ("bootstrap_overlap")
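To make the bootstrap-difference logic concrete, here is a minimal base-R sketch of the general idea, not the package's exact implementation: resample cases with replacement, compute the difference in correlation on each resample, and summarize the resulting distribution.

```r
set.seed(20250622)
n <- 100
y1    <- rnorm(n)
yhat1 <- y1 + rnorm(n)           # model 1: fairly accurate predictions
y2    <- rnorm(n)
yhat2 <- y2 + rnorm(n, sd = 2)   # model 2: noisier predictions

diffs <- replicate(10000, {
  i <- sample.int(n, replace = TRUE)            # resample cases with replacement
  cor(y1[i], yhat1[i]) - cor(y2[i], yhat2[i])   # difference in correlation
})

quantile(diffs, c(0.025, 0.975))                # percentile 95% CI
2 * min(mean(diffs <= 0), mean(diffs >= 0))     # rough two-sided bootstrap p-value
```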

Value

A comparison of predictive performance: either t-test results on prediction errors, or the results of a bootstrapped comparison of correlations or AUCs (for the overlap method, see the $OV element).

See Also

textTrain, textPredict

Examples

# Example with random data
set.seed(1)
y1 <- runif(10)
yhat1 <- runif(10)
y2 <- runif(10)
yhat2 <- runif(10)

boot_test <- textPredictTest(y1 = y1, y2 = y2, yhat1 = yhat1, yhat2 = yhat2)

OscarKjell/text documentation built on July 16, 2025, 9:04 p.m.