View source: R/pred_val_probs.R
pred_val_probs (R Documentation)
This function is included for situations where one has a vector of predicted
probabilities from a model and a vector of observed binary outcomes against
which the predictions are to be validated. See pred_validate for the main
validation function of this package.
pred_val_probs(binary_outcome, Prob, cal_plot = TRUE, level = 0.95, ...)
binary_outcome: vector of binary outcomes (coded as 1 if the outcome happened, and 0 otherwise). Must be the same length as Prob.

Prob: vector of predicted probabilities. Must be the same length as binary_outcome.

cal_plot: indicates whether a flexible calibration plot should be produced (TRUE) or not (FALSE).

level: the confidence level required for all performance metrics. Defaults to 0.95. Must be a value between 0 and 1.

...: further plotting arguments for the calibration plot. See Details below.
This function takes a vector of observed binary outcomes and a
corresponding vector of predicted risks (e.g. from a logistic regression
CPM), and calculates measures of predictive performance. The function is
intended as a standalone way of validating predicted risks against binary
outcomes outside of the usual pred_input_info() -> pred_validate() package
workflow. See pred_validate for the main validation function of this package.
Various metrics of calibration (agreement between the observed and predicted risks across the full risk range) and discrimination (the ability of the model to distinguish between those who develop the outcome and those who do not) are calculated.

For calibration, the observed-to-expected ratio, calibration intercept and calibration slope are estimated. The calibration intercept is estimated by fitting a logistic regression model to the observed binary outcomes, with the linear predictor of the model (the logit of the predicted risks) as an offset. For the calibration slope, a logistic regression model is fitted to the observed binary outcomes with the linear predictor as the only covariate.

For discrimination, the function estimates the area under the receiver operating characteristic curve (AUC). Various other metrics are also calculated to assess overall accuracy (Brier score, Cox-Snell R2).
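As a minimal sketch of the calibration calculations described above (not part of the package; it assumes vectors Prob and binary_outcome as defined in the Arguments section), the main calibration metrics and the Brier score can be reproduced by hand:

# linear predictor: logit of the predicted risks
lp <- log(Prob / (1 - Prob))

# observed-to-expected ratio
OE_ratio <- mean(binary_outcome) / mean(Prob)

# calibration intercept: logistic model with the linear predictor as offset
cal_intercept <- glm(binary_outcome ~ offset(lp), family = binomial)

# calibration slope: logistic model with the linear predictor as covariate
cal_slope <- glm(binary_outcome ~ lp, family = binomial)

# Brier score: mean squared difference between predicted risk and outcome
brier <- mean((Prob - binary_outcome)^2)

coef(cal_intercept)   # perfect calibration-in-the-large: 0
coef(cal_slope)["lp"] # perfect calibration slope: 1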
A flexible calibration plot is produced. Specify parameter cal_plot to
indicate whether a calibration plot should be produced (TRUE) or not (FALSE).
See pred_validate for details on this plot, and details of optional plotting
arguments.
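For example, a call that requests the calibration plot and forwards labelling arguments through ... might look as follows; note that xlab and ylab here are assumptions based on standard base-graphics conventions, and the definitive list of supported plotting arguments is documented under pred_validate:

# request the flexible calibration plot; extra arguments are passed via ...
# (xlab/ylab assumed to be supported; see pred_validate for the actual list)
val <- pred_val_probs(binary_outcome = binary_outcome,
                      Prob = Prob,
                      cal_plot = TRUE,
                      xlab = "Predicted risk",
                      ylab = "Observed risk")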
An object of class "predvalidate", which is a list containing relevant
calibration and discrimination measures. See pred_validate for details.
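A minimal sketch of working with the returned object, using only generic functions (the exact list components are documented under pred_validate):

val <- pred_val_probs(binary_outcome = binary_outcome,
                      Prob = Prob,
                      cal_plot = FALSE)
class(val)    # "predvalidate"
summary(val)  # formatted summary of the performance metrics
str(val)      # inspect the underlying list components directly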
pred_input_info, pred_validate
# simulate some data for purposes of example illustration
set.seed(1234)
x1 <- rnorm(2000)
LP <- -2 + (0.5 * x1)
PR <- 1 / (1 + exp(-LP))
y <- rbinom(2000, 1, PR)
dat <- data.frame(x1 = x1, y = y)

# fit hypothetical model to the first 1000 simulated observations
mod <- glm(y ~ x1, data = dat[1:1000, ], family = binomial(link = "logit"))

# obtain the predicted risks from the model for the remaining observations
pred_risk <- predict(mod, type = "response",
                     newdata = dat[1001:2000, ])

# use pred_val_probs to validate the predicted risks against the
# observed outcomes
summary(pred_val_probs(binary_outcome = dat$y[1001:2000],
                       Prob = pred_risk,
                       cal_plot = FALSE))