ThresholdPlot: Plot classifier metrics as a function of thresholds.

Description Usage Arguments Details See Also Examples

View source: R/ThresholdPlot.R

Description

Plot classifier metrics as a function of thresholds.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
ThresholdPlot(
  frame,
  xvar,
  truthVar,
  title,
  ...,
  metrics = c("sensitivity", "specificity"),
  truth_target = TRUE,
  points_to_plot = NULL,
  monochrome = TRUE,
  palette = "Dark2",
  linecolor = "black"
)

Arguments

frame

data frame to get values from

xvar

column of scores

truthVar

column of true outcomes

title

title to place on plot

...

no unnamed argument, added to force named binding of later arguments.

metrics

metrics to be computed. See Details for the list of allowed metrics

truth_target

truth value considered to be positive.

points_to_plot

how many data points to use for plotting. Defaults to NULL (all data)

monochrome

logical: if TRUE, all subgraphs plotted in same color

palette

character: if monochrome==FALSE, name of brewer color palette (can be NULL)

linecolor

character: if monochrome==TRUE, name of line color

Details

By default, ThresholdPlot plots sensitivity and specificity of a a classifier as a function of the decision threshold. Plotting sensitivity-specificity (or other metrics) as a function of classifier score helps identify a score threshold that achieves an acceptable tradeoff among desirable properties.

ThresholdPlot can plot a number of metrics. Some of the metrics are redundant, in keeping with the customary terminology of various analysis communities.

For example, plotting sensitivity/false_positive_rate as functions of threshold will "unroll" an ROC Plot.

ThresholdPlot can also plot distribution diagnostics about the scores:

Plots are in a single column, in the order specified by metrics.

points_to_plot specifies the approximate number of datums used to create the plots as an absolute count; for example setting points_to_plot = 200 uses approximately 200 points, rather than the entire data set. This can be useful when visualizing very large data sets.

See Also

PRTPlot

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
# data with two different regimes of behavior
d <- rbind(
  data.frame(
    x =  rnorm(1000),
    y = sample(c(TRUE, FALSE), prob = c(0.02, 0.98), size = 1000, replace = TRUE)),
  data.frame(
    x =  rnorm(200) + 5,
    y = sample(c(TRUE, FALSE), size = 200, replace = TRUE))
)

# Sensitivity/Specificity examples
ThresholdPlot(d, 'x', 'y',
   title = 'Sensitivity/Specificity',
   metrics = c('sensitivity', 'specificity'),
   truth_target = TRUE)
MetricPairPlot(d, 'x', 'y',
   x_metric = 'false_positive_rate',
   y_metric = 'true_positive_rate',
   truth_target = TRUE,
   title = 'ROC equivalent')
ROCPlot(d, 'x', 'y',
   truthTarget = TRUE,
   title = 'ROC example')

# Precision/Recall examples
ThresholdPlot(d, 'x', 'y',
   title = 'precision/recall',
   metrics = c('recall', 'precision'),
   truth_target = TRUE)
MetricPairPlot(d, 'x', 'y',
   x_metric = 'recall',
   y_metric = 'precision',
   title = 'recall/precision',
   truth_target = TRUE)
PRPlot(d, 'x', 'y',
   truthTarget = TRUE,
   title = 'p/r plot')

WinVector/WVPlots documentation built on Jan. 12, 2021, 2:38 a.m.