thresholder2: Choose a Probability Threshold

View source: R/thresholder2.R

thresholder2 R Documentation

Description

This function applies thresholder and returns the optimal thresholds according to multiple methods (see Details).

Usage

thresholder2(
  model,
  thr.method = "all",
  thr.interval = 0.01,
  final = TRUE,
  add.statistics = "J",
  obs.prev = NULL,
  FPC = 1,
  FNC = 1,
  req.sens = 0.85,
  req.spec = 0.85
)

## S3 method for class 'thresholder2'
plot(
  x,
  select.thr = "all",
  statistics = "Sensitivity+Specificity",
  display.final = TRUE,
  add.text = TRUE,
  plot = TRUE,
  ...
)

## S3 method for class 'thresholder2'
summary(
  object,
  which.statistics = "J",
  which.method = NULL,
  maximize = ifelse(which.statistics == "Dist", FALSE, TRUE),
  ...
)

## S3 method for class 'thresholder2'
as.data.frame(x, ...)

Arguments

model

A model returned by train.

thr.method

A vector of threshold methods to use when calculating the optimal threshold. Can be a character or numeric vector. See Details for possible values. The default, "all", uses all methods.

thr.interval

The interval (step size) between the probability threshold cutoffs to be evaluated. Should be between 0 and 0.1.
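As a rough illustration, assuming the cutoffs form a regular grid from 0 to 1 with step thr.interval (the exact grid thresholder2 builds internally is not stated on this page), a smaller interval means more candidate thresholds to evaluate:

```r
# Hypothetical sketch: a regular grid of cutoffs with step `thr.interval`.
# How thresholder2 builds its grid internally is an assumption here.
thr.interval <- 0.01
cutoffs <- seq(0, 1, by = thr.interval)

length(cutoffs)  # number of candidate thresholds for this step size
range(cutoffs)   # the grid spans the whole probability range
```

Halving thr.interval roughly doubles the number of cutoffs, and with it the work done for each resample.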

final

logical. Should only the final tuning parameters chosen by train be used?

add.statistics

A character vector indicating additional statistics to calculate. See thresholder for a list of possible values. Note that 'Sensitivity', 'Specificity', 'Kappa', 'Dist' and 'Detection Prevalence' are always calculated.

obs.prev

Observed prevalence, in case your data is taken from a larger dataset. Defaults to observed prevalence from training data.

FPC, FNC

False positive and false negative costs. Used only to calculate the "Cost" threshold.

req.sens, req.spec

Required sensitivity and specificity, used to calculate the "ReqSens" and "ReqSpec" thresholds.

x, object

An object returned by thresholder2.

select.thr

A vector of threshold methods to be displayed. Can also be given as indices into the thr.method used in thresholder2.

statistics

A character vector indicating the statistics to be displayed. Can also be given as indices into the statistics used in thresholder2. Alternatively, use "all" to display all available statistics, or "Sensitivity+Specificity" to display Sensitivity and Specificity in a single plot.

display.final

logical. Should only the thresholds for the final model tuned by train be displayed? This is only used if you set final = FALSE in thresholder2.

add.text

logical. Should the names of the threshold methods be plotted? They are displayed only if display.final = TRUE.

plot

logical. Return the plot or not?

...

ignored

which.statistics

A single character string naming the statistic used to select the best threshold. Only used if which.method is NULL.

which.method

A single character string naming the threshold method whose threshold should be returned.

maximize

A logical: should the statistic be maximized or minimized?
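The default for maximize in the Usage section is FALSE only when which.statistics is "Dist". A minimal sketch of that selection logic, using a made-up table and a hypothetical helper pick_threshold (not part of the package), could look like this:

```r
# Toy table of candidate thresholds (values made up for illustration only).
stats <- data.frame(
  threshold = c(0.3, 0.5, 0.7),
  J         = c(0.55, 0.62, 0.48),
  Dist      = c(0.29, 0.31, 0.52)
)

# Hypothetical helper mirroring the default of `maximize`:
# every statistic is maximized except "Dist", which is minimized.
pick_threshold <- function(stats, which.statistics = "J",
                           maximize = which.statistics != "Dist") {
  values <- stats[[which.statistics]]
  best <- if (maximize) which.max(values) else which.min(values)
  stats$threshold[best]
}

pick_threshold(stats, "J")     # threshold with the largest J
pick_threshold(stats, "Dist")  # threshold with the smallest Dist
```

Dist is treated differently because it measures distance from a perfect classifier, so smaller values are better.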

Details

Possible thr.method are:

  • Default - The default for most models; this sets the threshold to 0.5

  • Min_Presence - Required threshold to have Sensitivity = 1

  • 10

  • Sens=Spec - Optimize threshold where sensitivity equals specificity

  • MaxSens+Spec - Optimize the threshold that maximizes sensitivity plus specificity

  • MaxKappa - Optimize the threshold that maximizes kappa

  • PredPrev=Obs - Optimize the threshold where predicted prevalence equals observed prevalence

  • Dist - Optimize the threshold that minimizes Dist (see thresholder)

  • Cost - Optimize the threshold that maximizes Cost (see PresenceAbsence::optimal.thresholds)

  • ReqSens - Required threshold to have Sensitivity = req.sens

  • ReqSpec - Required threshold to have Specificity = req.spec

See the help documentation of PresenceAbsence::optimal.thresholds or caret::thresholder for more details. Most threshold method names match those used by optimal.thresholds.
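To make some of the methods listed above concrete, here is a minimal base R sketch on simulated data. It follows the descriptions in the list rather than the package source, so tie-breaking and the handling of resamples in thresholder2 may differ. The Dist formula (Euclidean distance to perfect sensitivity and specificity) and the "highest cutoff that still meets the requirement" reading of Min_Presence and ReqSens are assumptions.

```r
set.seed(1)
# Simulated data: obs is 1 (presence) or 0 (absence);
# prob is a made-up predicted probability of presence.
obs  <- rbinom(200, 1, 0.4)
prob <- plogis(rnorm(200, mean = ifelse(obs == 1, 1, -1)))

# Sensitivity and specificity at each candidate cutoff.
cutoffs <- seq(0, 1, by = 0.01)
sens <- sapply(cutoffs, function(t) mean(prob[obs == 1] >= t))
spec <- sapply(cutoffs, function(t) mean(prob[obs == 0] <  t))

# Assumed definition of Dist: distance to the perfect classifier.
dist <- sqrt((1 - sens)^2 + (1 - spec)^2)

req.sens <- 0.85
thr <- c(
  "Default"      = 0.5,
  "Min_Presence" = max(cutoffs[sens == 1]),         # highest cutoff keeping all presences
  "Sens=Spec"    = cutoffs[which.min(abs(sens - spec))],
  "MaxSens+Spec" = cutoffs[which.max(sens + spec)],
  "Dist"         = cutoffs[which.min(dist)],
  "ReqSens"      = max(cutoffs[sens >= req.sens])   # highest cutoff meeting req.sens
)
thr
```

Each entry is a single cutoff on the probability scale, which is what makes the methods directly comparable in a plot or summary.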

Value

An S3 object of class 'thresholder2', including:

  • thrs - A data.table with the threshold methods and their statistics.

  • thresholder - The output of thresholder.

For the plot method, a ggplot is returned if plot = TRUE. If plot = FALSE, a list of data.tables in long format (as used by ggplot) is returned instead.

Note

By default, Youden's J statistic is also returned, which is the same as TSS (True Skill Statistic).
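For reference, Youden's J is sensitivity plus specificity minus one. A small worked sketch with made-up confusion-matrix counts:

```r
# Youden's J from a 2x2 confusion matrix (counts are made up).
tp <- 40; fn <- 10   # presences predicted correctly / incorrectly
tn <- 70; fp <- 30   # absences predicted correctly / incorrectly

sensitivity <- tp / (tp + fn)   # 40 / 50
specificity <- tn / (tn + fp)   # 70 / 100

# J ranges from -1 to 1; 0 means no better than chance.
J <- sensitivity + specificity - 1
J
```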

See Also

thresholder

Examples

## Not run: 
# Select a threshold looking at all tuning parameters
t.obj <- thresholder2(model, thr.method = 1:6, final = FALSE)
plot(t.obj, select.thr = "all", statistics = "all", display.final = FALSE)
summary(t.obj)

# Select a threshold looking only at the final model
t.obj <- thresholder2(model, thr.interval = 0.005, final = TRUE)
plot(t.obj, select.thr = 3:7)
thr <- summary(t.obj)
thr # this will hold the best threshold based on which.statistics

as.data.frame(t.obj) # data.frame with thresholds and statistics for each method

## End(Not run)

correapvf/caretSDM documentation built on June 2, 2022, 8:29 a.m.