View source: R/boot_optimism.R
boot_optimism | R Documentation |
Estimate bias-corrected scores via calculation of bootstrap optimism (standard or .632).
Can also produce estimates for assessing the stability of prediction model predictions.
This function is called by validate.
boot_optimism(
  data,
  outcome,
  model_fun,
  pred_fun,
  score_fun,
  method = c("boot", ".632"),
  B = 200,
  ...
)
data |
the data used in developing the model. Should contain all variables considered (i.e., even those excluded by variable selection in the development sample) |
outcome |
character denoting the column name of the outcome in |
model_fun |
a function that takes at least one argument, |
pred_fun |
function that takes at least two arguments, |
score_fun |
a function to calculate the metrics of interest. If this is not specified |
method |
"boot" or ".632". The former estimates bootstrap optimism for each score and subtracts it
from the apparent scores (simple bootstrap estimates are also produced as a by-product).
The latter estimates ".632" optimism as described in Harrell (2015). See |
B |
number of bootstrap resamples to run |
... |
additional arguments for |
a list of class internal_boot containing:
apparent - scores calculated on the original data using the original model.
optimism - estimates of optimism for each score (the average difference between the score of each bootstrap model evaluated on its bootstrap sample and on the original sample), which can be subtracted from the 'apparent' performance calculated using the original model on the original data.
corrected - 'bias corrected' scores (apparent - optimism).
simple - if method = "boot", estimates of scores derived from the 'simple bootstrap'. This is the average of each score calculated from the bootstrap models evaluated on the original outcome data. NULL if method = ".632".
stability - if method = "boot", an N x (B+1) matrix, where N is the number of observations in data and B is the number of bootstrap samples. The first column contains the original predictions and each of the subsequent B columns contains the predicted probabilities of the outcome from one bootstrap model evaluated on the original data. There may be fewer than B+1 columns if errors occur during resamples (when model_fun throws an error, all scores are NA). NULL if method = ".632".
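The quantities described above can be illustrated with a minimal base-R sketch of the standard bootstrap optimism calculation. This is not the package's implementation: the simulated data, the helper names (fit_model, predict_prob, brier) and the choice of the Brier score as the only metric are assumptions made for illustration, and no error handling for failed resamples is included.

```r
set.seed(1)

# Hypothetical development data (not produced by the package)
n <- 500
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-1 + 0.8 * dat$x1 + 0.5 * dat$x2))

# Illustrative stand-ins for model_fun, pred_fun and score_fun
fit_model <- function(data) glm(y ~ x1 + x2, data = data, family = "binomial")
predict_prob <- function(model, data) {
  as.numeric(predict(model, newdata = data, type = "response"))
}
brier <- function(y, p) mean((y - p)^2)

B <- 50

# Apparent performance: original model scored on the original data
orig_pred <- predict_prob(fit_model(dat), dat)
apparent <- brier(dat$y, orig_pred)

boot_score <- numeric(B)  # bootstrap model scored on its own bootstrap sample
test_score <- numeric(B)  # bootstrap model scored on the original data
stability <- matrix(NA_real_, nrow = n, ncol = B + 1)
stability[, 1] <- orig_pred  # first column: original predictions

for (b in seq_len(B)) {
  idx <- sample.int(n, replace = TRUE)
  boot_dat <- dat[idx, ]
  boot_fit <- fit_model(boot_dat)
  boot_score[b] <- brier(boot_dat$y, predict_prob(boot_fit, boot_dat))
  p_orig <- predict_prob(boot_fit, dat)
  test_score[b] <- brier(dat$y, p_orig)
  stability[, b + 1] <- p_orig  # predictions for the original observations
}

optimism <- mean(boot_score - test_score)  # average bootstrap-minus-original difference
corrected <- apparent - optimism           # bias-corrected score
simple <- mean(test_score)                 # 'simple bootstrap' estimate
```

Because lower Brier scores are better, the optimism in this sketch is typically negative, so the corrected score is usually slightly larger (worse) than the apparent one.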
Steyerberg, E. W., Harrell Jr, F. E., Borsboom, G. J., Eijkemans, M. J. C., Vergouwe, Y., & Habbema, J. D. F. (2001). Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. Journal of Clinical Epidemiology, 54(8), 774-781.
Harrell Jr, F. E. (2015). Regression Modeling Strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. New York: Springer Science, LLC.
library(pminternal)
set.seed(456)
# simulate data with two predictors that interact
dat <- pmcalibration::sim_dat(N = 1000, a1 = -2, a3 = -.3)
mean(dat$y)
dat$LP <- NULL # remove linear predictor
# fit a (misspecified) logistic regression model
model_fun <- function(data, ...){
  glm(y ~ x1 + x2, data=data, family="binomial")
}
pred_fun <- function(model, data, ...){
  predict(model, newdata=data, type="response")
}
boot_optimism(data=dat, outcome="y", model_fun=model_fun, pred_fun=pred_fun,
              method="boot", B=20) # B set to 20 for example but should be >= 200
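The call above prints the returned object. As a sketch of how the components listed in the return value might be inspected, the result can instead be stored and its elements accessed by name. The data here are a base-R stand-in simulated for illustration (an assumption, not the package's own simulation helper), and the snippet assumes pminternal is installed.

```r
library(pminternal)
set.seed(456)

# hypothetical stand-in data: two predictors and a binary outcome
n <- 1000
dat2 <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat2$y <- rbinom(n, 1, plogis(-2 + dat2$x1 + dat2$x2))

model_fun <- function(data, ...){
  glm(y ~ x1 + x2, data=data, family="binomial")
}
pred_fun <- function(model, data, ...){
  predict(model, newdata=data, type="response")
}

# store the result rather than printing it
res <- boot_optimism(data=dat2, outcome="y", model_fun=model_fun,
                     pred_fun=pred_fun, method="boot", B=20)

res$apparent        # scores of the original model on the original data
res$optimism        # estimated optimism for each score
res$corrected       # apparent - optimism
res$simple          # simple bootstrap estimates (method = "boot" only)
dim(res$stability)  # N rows, at most B + 1 columns
```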