validate {pminternal}    R Documentation
Performs internal validation of a prediction model development procedure via bootstrapping or cross-validation. Many model types are supported via the insight and marginaleffects packages, or users can supply user-defined functions that implement the model development procedure and retrieve predictions. Bias-corrected scores and estimates of optimism (where applicable) are provided. See confint.internal_validate for calculation of confidence intervals.

Usage
validate(
fit,
method = c("boot_optimism", "boot_simple", ".632", "cv_optimism", "cv_average", "none"),
data,
outcome,
model_fun,
pred_fun,
score_fun,
B,
...
)
Arguments

fit
    a model object. If fit is given, the data used to develop the model and the predictions are obtained via the insight and marginaleffects packages, so data, outcome, model_fun, and pred_fun do not need to be specified.
method
    bias-correction method. Valid options are "boot_optimism", "boot_simple", ".632", "cv_optimism", "cv_average", or "none" (return apparent performance). See details.
data
    a data.frame containing the data used to fit the development model.
outcome
    character denoting the column name of the outcome in data.
model_fun
    for models that cannot be supplied via fit, this should be a function that takes one named argument, 'data' (the function should include ... among its arguments). This function should implement the entire model development procedure (hyperparameter tuning, variable selection, imputation, etc.) and return an object that can be used by pred_fun. Additional arguments can be supplied via ... (see the sketch following this arguments list for example signatures).
pred_fun
    for models that cannot be supplied via fit, this should be a function that takes two named arguments, 'model' and 'data' (the function should include ... among its arguments). 'model' is an object returned by model_fun. The function should return a vector of predicted risk probabilities of the same length as the number of rows in data. Additional arguments can be supplied via ...
score_fun
    function used to produce performance measures from the predicted risks and observed binary outcome. Should take two named arguments, 'y' and 'p' (the function should include ... among its arguments), and return a named vector of scores. If unspecified, score_binary is used.
B
    number of bootstrap replicates or cross-validation folds. If unspecified, B is set to 200 for method = "boot_*"/".632" and to 10 for method = "cv_*".
...
    additional arguments for user-defined functions. Arguments for producing calibration curves can be set via 'calib_args', which should be a named list (see the 'Calibration curves' section in the details below).
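To illustrate these signatures, the following is a minimal sketch of user-defined model, prediction, and scoring functions passed to validate. The logistic regression formula, the Brier score, and the simulated data are placeholders chosen for the example, not requirements of the package.

library(pminternal)

# toy data: a data.frame with a binary outcome 'y' (as in the examples below)
dat <- pmcalibration::sim_dat(N = 1000, a1 = -2, a3 = -.3)
dat$LP <- NULL

# model_fun: takes 'data' (and ...), runs the full development procedure,
# and returns an object that pred_fun can use
mod_fun <- function(data, ...) {
  glm(y ~ ., data = data, family = "binomial")
}

# pred_fun: takes 'model' and 'data' (and ...), returns predicted risks
prd_fun <- function(model, data, ...) {
  predict(model, newdata = data, type = "response")
}

# score_fun: takes 'y' and 'p' (and ...), returns a named vector of scores
scr_fun <- function(y, p, ...) {
  c(brier = mean((y - p)^2))
}

val <- validate(method = "cv_optimism", data = dat, outcome = "y",
                model_fun = mod_fun, pred_fun = prd_fun,
                score_fun = scr_fun, B = 10)
val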
Details

Internal validation can provide bias-corrected estimates of performance (e.g., C-statistic/AUC, calibration intercept/slope) for a model development procedure, i.e., the expected performance if the same procedure were applied to another sample of the same size from the same population (see references). There are several approaches to producing bias-corrected estimates (see below). It is important that the fit or model_fun provided implements the entire model development procedure, including any hyperparameter tuning and/or variable selection.

Note that validate does very little to check for missing values in predictors/features. If fit is supplied, insight::get_data will extract the data used to fit the model, which will usually result in complete cases being used. User-defined model and predict functions can be specified to handle missing values among predictor variables (see the sketch below). Currently any user-supplied data will have rows with missing outcome values removed.
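For example, a model_fun/pred_fun pair might perform simple mean imputation of missing predictor values as part of the development procedure. This is a hypothetical sketch (the impute_means helper, the outcome name 'y', and the use of glm are assumptions for illustration); any imputation strategy could be substituted.

# hypothetical helper: mean-impute numeric predictors, optionally reusing
# imputation values learned on the development data
impute_means <- function(data, means = NULL) {
  vars <- setdiff(names(data)[vapply(data, is.numeric, logical(1))], "y")
  if (is.null(means)) means <- lapply(data[vars], mean, na.rm = TRUE)
  for (v in vars) data[[v]][is.na(data[[v]])] <- means[[v]]
  list(data = data, means = means)
}

# imputation is part of the model development procedure...
mod_fun_mi <- function(data, ...) {
  imp <- impute_means(data)
  list(fit = glm(y ~ ., data = imp$data, family = "binomial"),
       means = imp$means)
}

# ...and is re-applied (using the stored means) when predicting on new data
prd_fun_mi <- function(model, data, ...) {
  data <- impute_means(data, means = model$means)$data
  predict(model$fit, newdata = data, type = "response")
}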
method

Different options for the method argument are described below:

boot_optimism
    (default) estimates optimism for each score and subtracts it from the apparent score (the score calculated with the original/development model evaluated on the original sample). A new model is fit using the same procedure on each bootstrap resample. Scores are calculated when applying the boot model to the boot sample (S_{boot}) and to the original sample (S_{orig}), and the difference gives an estimate of optimism for a given resample (S_{boot} - S_{orig}). The average optimism across the B resamples is subtracted from the apparent score to produce the bias-corrected score.

boot_simple
    implements the simple bootstrap. B bootstrap models are fit and evaluated on the original data. The average score across the B replicates is the bias-corrected score.

.632
    implements Harrell's adaptation of Efron's .632 estimator for binary outcomes (see rms::predab.resample and rms::validate). In this case the estimate of optimism is 0.632 \times (S_{app} - mean(S_{omit} \times w)), where S_{app} is the apparent performance score, S_{omit} is the score estimated using the bootstrap model evaluated on the out-of-sample observations, and w weights for the proportion of observations omitted (see Harrell 2015, p. 115).

cv_optimism
    estimates optimism via B-fold cross-validation. Optimism is the average of the difference in the performance measure between predictions made on the training vs test (held-out fold) data. This is the approach implemented in rms::validate with method="crossvalidation".

cv_average
    bias-corrected scores are the average of scores calculated by assessing the model developed on each fold evaluated on the test/held-out data. This approach is described and compared to "boot_optimism" and ".632" in Steyerberg et al. (2001).
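To make the optimism calculation concrete, here is a small self-contained sketch of the "boot_optimism" idea for a single score (the C-statistic). It is for illustration only and is not the internal implementation used by validate; the cstat helper and the simulated data are assumptions for the example.

set.seed(1)
dat <- pmcalibration::sim_dat(N = 1000, a1 = -2, a3 = -.3)
dat$LP <- NULL

# rank-based C-statistic (area under the ROC curve)
cstat <- function(y, p) {
  r <- rank(p)
  n1 <- sum(y == 1); n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

dev_model <- function(data) glm(y ~ ., data = data, family = "binomial")
risk <- function(model, data) predict(model, newdata = data, type = "response")

fit_orig <- dev_model(dat)
S_app <- cstat(dat$y, risk(fit_orig, dat))    # apparent performance

B <- 50
optimism <- replicate(B, {
  boot <- dat[sample(nrow(dat), replace = TRUE), ]
  fit_b <- dev_model(boot)
  S_boot <- cstat(boot$y, risk(fit_b, boot))  # boot model on boot sample
  S_orig <- cstat(dat$y, risk(fit_b, dat))    # boot model on original sample
  S_boot - S_orig                             # optimism for this resample
})

S_app - mean(optimism)                        # bias-corrected C-statistic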
Calibration curves

To make calibration curves and calculate the associated estimates (ICI, ECI, etc.; see score_binary), validate uses the default arguments in cal_defaults. These arguments are passed to the pmcalibration package (see ?pmcalibration::pmcalibration for options).

If a calibration plot (apparent vs bias-corrected calibration curves via cal_plot) is desired, the argument 'eval' should be provided. This should be the points at which to evaluate the calibration curve on each boot resample or cross-validation fold. A good option would be calib_args = list(eval = seq(min(p), max(p), length.out=100)), where p are the predictions from the original model evaluated on the original data.
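A hedged sketch of what this might look like in practice (dat is assumed to be a data.frame with a binary outcome y, as elsewhere on this page):

fit <- glm(y ~ ., data = dat, family = "binomial")   # development model
p <- predict(fit, type = "response")                 # predictions on the original data

val <- validate(fit, method = "boot_optimism", B = 200,
                calib_args = list(eval = seq(min(p), max(p), length.out = 100)))

cal_plot(val)   # apparent vs bias-corrected calibration curves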
Number of resamples/folds is less than requested

If the model_fun produces an error, or if score_binary is supplied with constant predictions or outcomes (e.g., all(y == 0)), the returned scores will all be NA. These are omitted from the calculation of optimism or other bias-corrected estimates (cv_average, boot_simple), and the number of successful resamples/folds will be < B. validate collects error messages and will produce a warning summarizing them. The number of successful samples is given in the 'n' column in the printed summary of an 'internal_validate' object.

It is important to understand what is causing the loss of resamples/folds. Some potential sources (this list will need to be added to) are as follows. For rare events, the resamples/folds may result in samples that have zero outcomes; for 'cv_*' this will especially be the case if B (the number of folds) is set high. There may be problems with factor/binary predictor variables with rare levels, which could be dealt with by specifying a model_fun that omits variables from the model formula if only one level is present (see the sketch below). The issue may also be related to the construction of calibration curves and may be addressed by more carefully selecting settings (see the section above).
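As an example of the suggestion about rare levels, a hypothetical model_fun might build the model formula only from predictors with more than one observed level in the given resample/fold (the outcome name 'y' and the glm are placeholders):

robust_mod_fun <- function(data, ...) {
  predictors <- setdiff(names(data), "y")
  # keep only predictors with more than one observed (non-missing) level
  keep <- vapply(predictors,
                 function(v) length(unique(data[[v]][!is.na(data[[v]])])) > 1,
                 logical(1))
  glm(reformulate(predictors[keep], response = "y"),
      data = data, family = "binomial")
}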
Value

an object of class internal_validate containing apparent and bias-corrected estimates of performance scores. If method = "boot_*" it also contains results pertaining to the stability of predictions across bootstrapped models (see Riley and Collins, 2023).
References

Steyerberg, E. W., Harrell Jr, F. E., Borsboom, G. J., Eijkemans, M. J. C., Vergouwe, Y., & Habbema, J. D. F. (2001). Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. Journal of Clinical Epidemiology, 54(8), 774-781.

Harrell Jr, F. E. (2015). Regression Modeling Strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. New York: Springer Science, LLC.

Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation. Journal of the American Statistical Association, 78(382), 316-331.

Van Calster, B., Steyerberg, E. W., Wynants, L., & van Smeden, M. (2023). There is no such thing as a validated prediction model. BMC Medicine, 21(1), 70.

Riley, R. D., & Collins, G. S. (2023). Stability of clinical prediction models developed using statistical or machine learning methods. Biometrical Journal, 65(8), 2200302. doi:10.1002/bimj.202200302
Examples

library(pminternal)
set.seed(456)
# simulate data with two predictors that interact
dat <- pmcalibration::sim_dat(N = 2000, a1 = -2, a3 = -.3)
mean(dat$y)
dat$LP <- NULL # remove linear predictor
# fit a (misspecified) logistic regression model
m1 <- glm(y ~ ., data=dat, family="binomial")
# internal validation of m1 via bootstrap optimism with 10 resamples
# B = 10 for example but should be >= 200 in practice
m1_iv <- validate(m1, method="boot_optimism", B=10)
m1_iv
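# Possible follow-up (a sketch, not part of the original example):
# confidence intervals for the bias-corrected scores via confint
# (see confint.internal_validate)
confint(m1_iv)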