library(CDIPATools)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Read in Sample Dataset

This dataset ships with the package. It simulates the output of a binary classification model: it contains a column of predicted probability scores, a column of binary (0/1) predicted classes, and a column with the true binary outcome.

fake_predictions_data <- FakePredictionResults
head(fake_predictions_data)
dim(fake_predictions_data)

We will use the "true.risk.bin" column as true outcomes, and the column "est.risk.score" as the predicted probability by a model

Now we will walk through how each function in the "~/R/utils/metrics.R" file works.

confusion_matrix

We will start by showing how the confusion matrix function works, using a cut-off threshold of 0.5 for the prediction probabilities.

confusion_matrix(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
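
As a cross-check, the same counts can be recovered with base R. This is a minimal sketch that assumes a prediction is classified as positive when est.risk.score is at least 0.5 and that true.risk.bin is coded 0/1; confusion_matrix may handle the threshold boundary differently, so treat it as an illustration rather than a definition of the package's behaviour.

# hand-rolled confusion matrix, assuming a >= 0.5 cut-off
pred_class <- as.integer(fake_predictions_data$est.risk.score >= 0.5)
truth <- fake_predictions_data$true.risk.bin
table(predicted = pred_class, actual = truth)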

fpr

We will show how the false positive rate function works, using a cut-off threshold of 0.5 for the prediction probabilities.

fpr(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
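
Reusing pred_class and truth from the sketch above, the false positive rate is FP / (FP + TN), the share of actual negatives that get predicted positive. Under the same >= 0.5 assumption this should roughly match the fpr output.

# FPR by hand: false positives divided by all actual negatives
sum(pred_class == 1 & truth == 0) / sum(truth == 0)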

fnr

We will show how the false negative rate function works, using a cut-off threshold of 0.5 for the prediction probabilities.

fnr(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
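
By the same logic, the false negative rate is FN / (FN + TP), the share of actual positives that get predicted negative.

# FNR by hand: false negatives divided by all actual positives
sum(pred_class == 0 & truth == 1) / sum(truth == 1)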

tpr

We will show how the true positive rate function works, using a cut-off threshold of 0.5 for the prediction probabilities.

tpr(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
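
The true positive rate (sensitivity) is TP / (TP + FN); the hand-rolled version below should agree with tpr under the same cut-off assumption.

# TPR by hand: true positives divided by all actual positives
sum(pred_class == 1 & truth == 1) / sum(truth == 1)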

tnr

We will show how the true negative rate function works, using a cut-off threshold of 0.5 for the prediction probabilities.

tnr(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
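
The true negative rate (specificity) is TN / (TN + FP); again this is only a cross-check under the assumed >= 0.5 cut-off.

# TNR by hand: true negatives divided by all actual negatives
sum(pred_class == 0 & truth == 0) / sum(truth == 0)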

auc_roc

We will show how the function that computes the area under the ROC curve (AUC-ROC) works.

auc_roc(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin)
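
As an independent cross-check, the AUC-ROC can be computed in base R via the rank (Wilcoxon-Mann-Whitney) formulation, which needs no threshold; ties in the scores are handled by average ranks. This is a sketch, not a description of how auc_roc is implemented.

# rank-based AUC: probability that a random positive scores higher than a random negative
scores <- fake_predictions_data$est.risk.score
truth <- fake_predictions_data$true.risk.bin
n_pos <- sum(truth == 1)
n_neg <- sum(truth == 0)
r <- rank(scores)
(sum(r[truth == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)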

auc_pr

We will show how the function that computes the area under the precision-recall curve (AUC-PR) works.

auc_pr(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin)
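
One common way to approximate the area under the precision-recall curve is average precision, sketched below in base R. auc_pr may use a different interpolation, so the two numbers need not match exactly; this is only meant to show what the metric summarises.

# average precision: precision accumulated over each increment in recall
ord <- order(fake_predictions_data$est.risk.score, decreasing = TRUE)
y <- fake_predictions_data$true.risk.bin[ord]
tp <- cumsum(y)
prec <- tp / seq_along(y)
rec <- tp / sum(y)
sum(prec * diff(c(0, rec)))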

precision

We will show how the precision function works, using a cut-off threshold of 0.5 for the prediction probabilities.

precision(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
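
Precision is TP / (TP + FP), the share of predicted positives that are actually positive; reusing pred_class and truth from earlier gives a quick cross-check.

# precision by hand: true positives divided by all predicted positives
sum(pred_class == 1 & truth == 1) / sum(pred_class == 1)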

recall

We will show how the recall function works, using a cut-off threshold of 0.5 for the prediction probabilities.

recall(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
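
Recall is TP / (TP + FN), the share of actual positives that are predicted positive (the same quantity as the true positive rate above).

# recall by hand: true positives divided by all actual positives
sum(pred_class == 1 & truth == 1) / sum(truth == 1)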

accuracy

We will show how the accuracy function works, using a cut-off threshold of 0.5 for the prediction probabilities.

accuracy(predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin, threshold = .5)
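
Accuracy is the share of all predictions that match the true outcome, (TP + TN) / N.

# accuracy by hand: proportion of correct predictions
mean(pred_class == truth)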

threshold_finder

The threshold_finder function finds the optimal cut-off threshold for the predictions based on a metric function passed to it. Any metric function passed in must take the parameters predictions, outcomes, and threshold. We will show how to find the threshold that maximizes precision.

threshold_finder(precision, predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin)

We will show how to find the threshold that maximizes recall.

threshold_finder(recall, predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin)

We will show how to find the threshold that maximizes accuracy.

threshold_finder(accuracy, predictions = fake_predictions_data$est.risk.score, outcomes = fake_predictions_data$true.risk.bin)
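
To make the idea concrete, here is a minimal sketch of a grid search over candidate thresholds that picks the one maximising precision. This assumes threshold_finder does something similar internally, which is an assumption about its implementation rather than a description of it; the grid of candidate cut-offs is also arbitrary.

# evaluate precision on a grid of thresholds and keep the best one
candidate_thresholds <- seq(0.01, 0.99, by = 0.01)
precision_by_threshold <- sapply(candidate_thresholds, function(t) {
  precision(predictions = fake_predictions_data$est.risk.score,
            outcomes = fake_predictions_data$true.risk.bin,
            threshold = t)
})
candidate_thresholds[which.max(precision_by_threshold)]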

