error.threshold.plot: Error Threshold Plot


error.threshold.plot    R Documentation

Error Threshold Plot

Description

error.threshold.plot takes a single model and plots the sensitivity and specificity as a function of threshold. It will optionally add other error statistics such as PCC and/or Kappa to the plot. Optionally, it will also optimize the choice of threshold by several criteria, return the results as a dataframe, and mark the optimized thresholds on the plot.

Usage

error.threshold.plot(DATA, threshold = 101, which.model = 1, na.rm = FALSE, 
xlab = "Threshold", ylab = "Accuracy Measures", main = NULL, model.names = NULL, 
color = NULL, line.type = NULL, lwd = 1, plot.it = TRUE, opt.thresholds = NULL, 
opt.methods = NULL, req.sens, req.spec, obs.prev = NULL, smoothing = 1, 
vert.lines = FALSE, add.legend = TRUE, legend.text = legend.names, 
legend.cex = 0.8, add.opt.legend = TRUE, opt.legend.text = NULL, 
opt.legend.cex = 0.7, pch = NULL, FPC, FNC)

Arguments

DATA

a matrix or dataframe of observed and predicted values where each row represents one plot and where columns are:

DATA[,1]   plot ID                                      text
DATA[,2]   observed values                              zero-one values
DATA[,3]   predicted probabilities from first model     numeric (between 0 and 1)
DATA[,4]   predicted probabilities from second model, etc.

threshold

cutoff values between zero and one used for translating predicted probabilities into 0/1 values. It can be a single value between zero and one, a vector of values between zero and one, or a positive integer representing the number of evenly spaced thresholds to calculate. Defaults to 101 evenly spaced thresholds (see Usage).
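
As an illustration of the three accepted forms, here is a minimal sketch in base R. The helper expand_threshold is hypothetical and not part of PresenceAbsence; in particular, the rule that a whole number greater than one counts as "number of thresholds" is an assumption about how the integer form is told apart from a single cutoff.

```r
# Hypothetical helper (not part of PresenceAbsence): expand a `threshold`
# argument into the vector of cutoffs it stands for.
expand_threshold <- function(threshold) {
  # Assumption: a single whole number greater than 1 is a count of
  # evenly spaced cutoffs between 0 and 1
  if (length(threshold) == 1 && threshold > 1 && threshold == round(threshold)) {
    return(seq(0, 1, length.out = threshold))
  }
  # Otherwise the values must already be cutoffs in [0, 1]
  stopifnot(all(threshold >= 0 & threshold <= 1))
  sort(unique(threshold))
}

grid   <- expand_threshold(101)                  # 0.00, 0.01, ..., 1.00
single <- expand_threshold(0.5)                  # one cutoff
custom <- expand_threshold(c(0.75, 0.25, 0.5))   # user-supplied cutoffs, sorted
```

With the default of 101, the grid has a spacing of 0.01, which is fine enough for smooth sensitivity and specificity curves.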

which.model

a number indicating which model from DATA should be used

na.rm

a logical indicating whether missing values should be removed

xlab

a title for the x axis

ylab

a title for the y axis

main

an overall title for the plot

model.names

a vector of the names of each model included in DATA to be used in the legend box

color

should each error statistic be plotted in a different color. It can be a logical value (where TRUE = color and FALSE = black and white), or a vector of color codes specifying particular colors for each line.

line.type

should each model be plotted in a different line type. It can be a logical value (where TRUE = dashed lines and FALSE = solid lines), or a vector of codes specifying particular line types for each line.

lwd

line width

plot.it

a logical indicating if a graphical plot should be produced

opt.thresholds

logical indicating whether the optimal thresholds should be calculated and plotted, or a vector specifying thresholds to plot

opt.methods

what methods should be used to optimize thresholds. Given either as a vector of method names or method numbers. Possible values are:

 1   Default        threshold=0.5
 2   Sens=Spec      sensitivity=specificity
 3   MaxSens+Spec   maximizes (sensitivity+specificity)/2
 4   MaxKappa       maximizes Kappa
 5   MaxPCC         maximizes PCC (percent correctly classified)
 6   PredPrev=Obs   predicted prevalence=observed prevalence
 7   ObsPrev        threshold=observed prevalence
 8   MeanProb       mean predicted probability
 9   MinROCdist     minimizes distance between ROC plot and (0,1)
10   ReqSens        user defined required sensitivity
11   ReqSpec        user defined required specificity
12   Cost           user defined relative costs ratio

req.sens

a value between zero and one giving the user defined required sensitivity. Only used if opt.thresholds = TRUE. Note that req.sens = (1-maximum allowable errors for points with positive observations).

req.spec

a value between zero and one giving the user defined required specificity. Only used if opt.thresholds = TRUE. Note that req.spec = (1 - maximum allowable errors for points with negative observations).

obs.prev

observed prevalence for opt.methods = "PredPrev=Obs" and "ObsPrev". Defaults to observed prevalence from DATA.

smoothing

smoothing factor for maximizing/minimizing. Only used if opt.thresholds = TRUE. Instead of finding the single threshold that gives the max/min value, the function will average the thresholds of the given number of max/min values.
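
The averaging described above can be sketched as follows for a maximized statistic. The helper smoothed_argmax and the Kappa values are made up for illustration; how the package itself breaks ties between equal values is not covered here.

```r
# Hypothetical helper (not part of PresenceAbsence): average the thresholds
# that belong to the `smoothing` largest values of a statistic.
smoothed_argmax <- function(thresholds, statistic, smoothing = 1) {
  # Indices of the `smoothing` best values, best first
  top <- order(statistic, decreasing = TRUE)[seq_len(smoothing)]
  mean(thresholds[top])
}

thresholds <- c(0.1, 0.2, 0.3, 0.4, 0.5)
kappa_vals <- c(0.20, 0.55, 0.50, 0.60, 0.30)   # made-up Kappa values

best_1 <- smoothed_argmax(thresholds, kappa_vals, smoothing = 1)  # single maximum
best_3 <- smoothed_argmax(thresholds, kappa_vals, smoothing = 3)  # mean of best three
```

Here smoothing = 1 returns the threshold of the single highest Kappa, while smoothing = 3 averages the thresholds of the three highest values, which damps the effect of an isolated spike in the curve.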

vert.lines

a logical where: TRUE means vertical lines added to plot at optimal thresholds; FALSE means no vertical lines, instead optimal thresholds marked along error statistics plots. Only used if opt.thresholds = TRUE.

add.legend

logical indicating if a legend for accuracy statistics should be included on the plot

legend.text

a vector of text for accuracy statistics legend. Defaults to name of each accuracy statistic.

legend.cex

cex for accuracy statistics legend

add.opt.legend

logical indicating if a legend for optimal threshold criteria should be included on the plot

opt.legend.text

a vector of text for optimal threshold criteria legend. Defaults to text corresponding to 'opt.methods'.

opt.legend.cex

cex for optimization criteria legend

pch

plotting "character", i.e., symbol used to mark the optimal thresholds on the plot. pch can either be a single character or an integer code for one of a set of graphics symbols. See help(points) for details.

FPC

False Positive Costs, or for C/B ratio C = 'net costs of treating nondiseased individuals'.

FNC

False Negative Costs, or for C/B ratio B = 'net benefits of treating diseased individuals'.

Details

error.threshold.plot serves two purposes. First, if plot.it = TRUE, it produces a graphical plot. Second, if opt.thresholds = TRUE, it finds optimal thresholds by several criteria. These optimal thresholds, along with basic accuracy measures for each type of optimal threshold, are returned as a dataframe. If a plot is produced, the optimal thresholds are also added to the plot.

The graphical plot will always include lines showing sensitivity and specificity as a function of threshold. In addition, for opt.methods = "MaxKappa", "MaxPCC", "MinROCdist", or "MaxSens+Spec" additional lines will be added to show the statistic being maximized/minimized.

These lines will be added to the graph even if opt.thresholds = FALSE. So, for example, to produce a graph showing sensitivity, specificity, and Kappa as functions of threshold without marking the optimal thresholds, set opt.thresholds = FALSE and opt.methods = "MaxKappa".
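
To make the plotted quantities concrete, the following base R sketch computes sensitivity, specificity, PCC, and Kappa over a grid of thresholds. The data and the helper accuracy_at are made up for illustration, and treating a prediction equal to the cutoff as a presence (>=) is an assumption rather than a statement about the package's own rule.

```r
# Made-up observations and predicted probabilities (illustration only)
obs  <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
pred <- c(0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.3, 0.2, 0.1, 0.1)

# Hypothetical helper (not part of PresenceAbsence): accuracy at one cutoff
accuracy_at <- function(cutoff, obs, pred) {
  called <- as.integer(pred >= cutoff)   # assumption: ties count as presence
  tp <- sum(called == 1 & obs == 1)
  fn <- sum(called == 0 & obs == 1)
  tn <- sum(called == 0 & obs == 0)
  fp <- sum(called == 1 & obs == 0)
  n  <- length(obs)
  pcc <- (tp + tn) / n
  # Agreement expected by chance, from the marginal totals
  chance <- ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n^2
  c(threshold   = cutoff,
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    PCC         = pcc,
    Kappa       = (pcc - chance) / (1 - chance))
}

# One row per threshold, one column per statistic
thresholds <- seq(0, 1, length.out = 11)
curves <- as.data.frame(t(sapply(thresholds, accuracy_at, obs = obs, pred = pred)))
```

Each column of curves other than threshold corresponds to one line of the kind of plot described above; matplot(curves$threshold, curves[, -1], type = "l") would draw them together.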

See optimal.thresholds for more details on the optimization methods, and on the arguments req.sens, req.spec, obs.prev, smoothing, FPC, and FNC.

When opt.thresholds = TRUE, the default is to plot the optimal thresholds directly along the corresponding error statistics (or along the sensitivity line if the method has no corresponding error statistic). If the argument vert.lines = TRUE, a vertical line is drawn at each optimal threshold, and the lines are labeled across the top of the plot.

Note: if too many methods are included in opt.methods, the graph will get very crowded.
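
Three of the criteria listed under opt.methods ("Sens=Spec", "MaxSens+Spec", and "MinROCdist") can be sketched directly from their definitions. This is an illustration in base R on made-up data, not the code used by optimal.thresholds; the >= rule for ties and the coarse threshold grid are assumptions.

```r
# Made-up data: 4 presences followed by 8 absences (illustration only)
obs  <- c(rep(1, 4), rep(0, 8))
pred <- c(0.9, 0.8, 0.6, 0.3,
          0.7, 0.5, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1)
thresholds <- seq(0.05, 0.95, by = 0.1)

# Sensitivity and specificity at every candidate threshold
sens <- sapply(thresholds, function(th) mean(pred[obs == 1] >= th))
spec <- sapply(thresholds, function(th) mean(pred[obs == 0] <  th))

# "Sens=Spec": smallest gap between the two curves
sens_eq_spec <- thresholds[which.min(abs(sens - spec))]

# "MaxSens+Spec": largest mean of sensitivity and specificity
max_sens_spec <- thresholds[which.max((sens + spec) / 2)]

# "MinROCdist": ROC point (1 - specificity, sensitivity) closest to (0, 1)
min_roc_dist <- thresholds[which.min(sqrt((1 - spec)^2 + (1 - sens)^2))]
```

On this data the first criterion selects 0.45 and the other two select 0.55, which shows that different criteria need not agree on a single "optimal" threshold.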

Value

If plot.it = TRUE, creates a graphical plot.

If opt.thresholds = TRUE, returns a dataframe of information about the optimal thresholds where:

[,1]   legend.names   type of optimal threshold
[,2]   threshold      optimal threshold
[,3]   PCC            at that threshold
[,4]   sensitivity    at that threshold
[,5]   specificity    at that threshold
[,6]   Kappa          at that threshold

Author(s)

Elizabeth Freeman eafreeman@fs.fed.us

See Also

optimal.thresholds, presence.absence.accuracy, roc.plot.calculate, presence.absence.summary

Examples


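library(PresenceAbsence)

# Simulated presence/absence data set used in these examples
data(SIM3DATA)

# First model, optimization methods 1 (Default), 2 (Sens=Spec), 5 (MaxPCC)
error.threshold.plot(SIM3DATA, opt.methods = c(1, 2, 5))

# Second model, optimal thresholds marked with vertical lines
error.threshold.plot(SIM3DATA,
                     which.model = 2,
                     opt.thresholds = TRUE,
                     opt.methods = c("Default", "Sens=Spec", "MinROCdist"),
                     vert.lines = TRUE)

# Second model with most arguments spelled out
error.threshold.plot(SIM3DATA,
                     threshold = 101,
                     which.model = 2,
                     na.rm = TRUE,
                     xlab = "Threshold",
                     ylab = "Accuracy Measures",
                     main = "Error Rate versus Threshold",
                     model.names = NULL,
                     pch = NULL,
                     color = c(3, 5, 7),
                     line.type = NULL,
                     lwd = 1,
                     plot.it = TRUE,
                     opt.thresholds = TRUE,
                     opt.methods = 1:4,
                     req.sens = 0.85,
                     req.spec = 0.85,
                     obs.prev = NULL,
                     smoothing = 1,
                     vert.lines = FALSE,
                     add.legend = TRUE,
                     legend.cex = 0.8)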



PresenceAbsence documentation built on Jan. 7, 2023, 9:09 a.m.