View source: R/2_4_1_textPredictTextTrained.R
textPredictTest | R Documentation |
Compares predictive performance between two models, using either a paired t-test on errors or bootstrapped comparisons of correlation or AUC.
textPredictTest(
  y1,
  y2,
  yhat1,
  yhat2,
  method = "t-test",
  statistic = "correlation",
  paired = TRUE,
  event_level = "first",
  bootstraps_times = 10000,
  seed = 20250622,
  ...
)
y1 |
The observed scores (i.e., the outcome values the model was trained to predict). |
y2 |
The second set of observed scores (default = NULL), used when comparing models that predict different outcomes. In this case a bootstrap procedure creates two distributions of correlations that are compared (see Details). |
yhat1 |
The predicted scores from model 1. |
yhat2 |
The predicted scores from model 2 that will be compared with model 1. |
method |
Character string specifying the comparison approach.
- "t-test": use when comparing prediction errors from two models predicting the same outcome (only supported when statistic = "correlation"); performs a paired t-test on the absolute errors.
- "bootstrap_difference": use to compare correlations or AUCs from two models predicting either the same outcome (e.g., does Model A outperform Model B on the same classification task?) or different outcomes (e.g., mental vs. physical health); bootstraps the difference across resamples to compute confidence intervals and p-values.
- "bootstrap_overlap": use when comparing predictions for different outcomes; generates bootstrap distributions of correlation or AUC values (depending on statistic) and tests the overlap of these distributions (requires the overlapping package).
Choose the method that aligns with your research question: error comparison on the same outcome, difference testing on the same or different outcomes, or overlap of performance distributions across different outcomes. |
statistic |
Character ("correlation" or "auc") specifying the statistic to be compared in the bootstrap procedures. |
paired |
Logical; whether to run a paired test in stats::t.test (default = TRUE). |
event_level |
Character "first" or "second" for computing the auc in the bootstrap. |
bootstraps_times |
Number of bootstrap resamples (used when y2 is provided). |
seed |
Random seed for reproducibility. |
... |
Additional settings passed to stats::t.test or overlapping::overlap (e.g., plot = TRUE). |
- If 'method = "t-test"' is chosen, the function compares the absolute prediction errors (|yhat - y|) from two models predicting the **same** outcome using a paired t-test. Only 'y1', 'yhat1', and 'yhat2' are required.
- If 'method = "bootstrap_difference"' is chosen, the function compares differences in correlation or AUC between **two outcomes** (or the same outcome if y1 = y2), using bootstrapped resampling. Both 'y1' and 'y2' must be provided (and have the same length).
- If 'method = "bootstrap_overlap"' is chosen, the function generates bootstrap distributions of correlation or AUC values for each outcome and tests the overlap of these distributions, assessing similarity in predictive performance.
Choose the method that aligns with your research question:
- Error comparison on the same outcome
- Bootstrapped difference testing on the same or different outcomes
- Overlap of performance distributions across different outcomes
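The following is a minimal sketch of all three calls on simulated data, assuming the text package is installed (and, for "bootstrap_overlap", the overlapping package); the variable names and the reduced bootstraps_times value are illustrative only and not part of the official example.

library(text)

set.seed(2024)
y_obs_a <- runif(50)   # observed scores for outcome A
y_obs_b <- runif(50)   # observed scores for outcome B
pred_a  <- runif(50)   # predictions from model 1
pred_b  <- runif(50)   # predictions from model 2

# Same outcome: paired t-test on absolute prediction errors (y2 is not needed).
res_ttest <- textPredictTest(
  y1 = y_obs_a, yhat1 = pred_a, yhat2 = pred_b,
  method = "t-test"
)

# Same or different outcomes: bootstrapped difference in correlations.
res_diff <- textPredictTest(
  y1 = y_obs_a, y2 = y_obs_b, yhat1 = pred_a, yhat2 = pred_b,
  method = "bootstrap_difference", statistic = "correlation",
  bootstraps_times = 1000
)

# Different outcomes: overlap of bootstrapped correlation distributions;
# plot = TRUE is passed on to overlapping::overlap via ...
res_overlap <- textPredictTest(
  y1 = y_obs_a, y2 = y_obs_b, yhat1 = pred_a, yhat2 = pred_b,
  method = "bootstrap_overlap", statistic = "correlation",
  bootstraps_times = 1000, plot = TRUE
)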
Comparison of the models' performance, based on either a t-test on prediction errors or the overlap of bootstrapped distributions (see $OV).
See also textTrain and textPredict.
# Example random data
y1 <- runif(10)
yhat1 <- runif(10)
y2 <- runif(10)
yhat2 <- runif(10)
boot_test <- textPredictTest(y1, y2, yhat1, yhat2)
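# The call above uses the default method = "t-test". As a hedged follow-up,
# assuming an overlap comparison of the two outcomes is wanted instead, the
# overlap estimate should be available via the $OV element mentioned under
# Value (the exact structure of the returned object is an assumption here).
overlap_test <- textPredictTest(
  y1 = y1, y2 = y2, yhat1 = yhat1, yhat2 = yhat2,
  method = "bootstrap_overlap", statistic = "correlation",
  bootstraps_times = 1000
)
overlap_test$OV  # overlap of the bootstrapped distributions (assumed field)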