plot_cumgains: Cumulative gains plot


View source: R/plottingmodelplots.R

Description

Generates the cumulative gains plot. This plot, often referred to as the gains chart, helps answer the question: When we apply the model and select the best X ntiles, what percentage of the actual target class observations can we expect to target?
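To make the quantity on the y-axis concrete, here is a minimal base R sketch of how cumulative gains per ntile could be computed by hand. It is an illustration only, not the package's implementation: the helper cumulative_gains and the simulated scores and actuals are hypothetical and do not appear in modelplotr.

```r
# Hypothetical helper (NOT part of modelplotr): cumulative gains per ntile.
cumulative_gains <- function(scores, actuals, ntiles = 10) {
  stopifnot(length(scores) == length(actuals), ntiles >= 1, sum(actuals) > 0)
  # Rank observations from highest to lowest model score
  ord <- order(scores, decreasing = TRUE)
  actuals <- actuals[ord]
  n <- length(actuals)
  # Ntile 1 holds the highest-scored observations, ntile `ntiles` the lowest
  ntile <- ceiling(seq_len(n) * ntiles / n)
  # Count target-class observations per ntile, then accumulate
  hits <- tapply(actuals, ntile, sum)
  data.frame(
    ntile   = as.integer(names(hits)),
    cumgain = as.numeric(cumsum(hits)) / sum(actuals)
  )
}

# Simulated data (an assumption for illustration): higher score, more likely a target
set.seed(42)
scores  <- runif(1000)
actuals <- rbinom(1000, size = 1, prob = scores)

gains <- cumulative_gains(scores, actuals, ntiles = 10)
# Share of all target-class observations found in the best 2 ntiles
gains[gains$ntile == 2, ]
```

The cumgain value at a given ntile is the share of all target-class observations found when selecting every ntile up to and including that one, which is the quantity the question above asks about.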

Usage

plot_cumgains(
  data = plot_input,
  highlight_ntile = NA,
  highlight_how = "plot_text",
  save_fig = FALSE,
  save_fig_filename = NA,
  custom_line_colors = NA,
  custom_plot_text = NULL
)

Arguments

data

Dataframe. Must be created with plotting_scope or otherwise meet the required input format.

highlight_ntile

Integer. The ntile at which the plot is annotated and/or performance is highlighted.

highlight_how

String. How to annotate the plot. Possible values: "plot_text", "plot", "text". Default is "plot_text", which highlights the ntile and its value on the plot and also adds explanatory text below the plot. "plot" only highlights the plot and adds no text below it. "text" adds text below the plot explaining the plot at the chosen ntile but does not highlight the plot.

save_fig

Logical. Save plot to file? Default = FALSE. When set to TRUE, saved plots are optimized for 18x12cm.

save_fig_filename

String. Filename of saved plot. By default, the plot is saved as tempdir()/plotname.png.

custom_line_colors

Vector of Strings. Colors for the lines in the plot. When not specified, colors from the RColorBrewer palette "Set1" are used.

custom_plot_text

List. List with customized textual elements for plot. Create a list with defaults by using customize_plot_text and override default values to customize.

Value

ggplot object. Cumulative gains plot.

See Also

modelplotr for generic info on the package modelplotr

vignette('modelplotr')

plotting_scope for details on the function plotting_scope that transforms a dataframe created with prepare_scores_and_ntiles or aggregate_over_ntiles to a dataframe in the required format for all modelplotr plots.

aggregate_over_ntiles for details on the function aggregate_over_ntiles that aggregates the output of prepare_scores_and_ntiles to create a dataframe with aggregated actuals and predictions. In most cases, you do not need to use it since the plotting_scope function will call this function automatically.

https://github.com/modelplot/modelplotr for details on the package

https://modelplot.github.io/ for our blog on the value of the model plots

Examples

# load example data (Bank clients with/without a term deposit - see ?bank_td for details)
data("bank_td")
# prepare data for training model for binomial target has_td and train models
train_index =  sample(seq(1, nrow(bank_td)),size = 0.5*nrow(bank_td) ,replace = FALSE)
train = bank_td[train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
test = bank_td[-train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
# train models using caret... (or use mlr, h2o or keras ... see ?prepare_scores_and_ntiles)
# setting caret cross validation, here tuned for speed (not accuracy!)
fitControl <- caret::trainControl(method = "cv",number = 2,classProbs=TRUE)
# random forest using ranger package, here tuned for speed (not accuracy!)
rf = caret::train(has_td ~.,data = train, method = "ranger",trControl = fitControl,
                  tuneGrid = expand.grid(.mtry = 2,.splitrule = "gini",.min.node.size=10))
# mnl model using glmnet package
mnl = caret::train(has_td ~.,data = train, method = "glmnet",trControl = fitControl)
# load modelplotr
library(modelplotr)
# transform datasets and model objects to input for modelplotr
scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("train","test"),
                         dataset_labels = list("train data","test data"),
                         models = list("rf","mnl"),
                         model_labels = list("random forest","multinomial logit"),
                         target_column="has_td",
                         ntiles=100)
plot_input <- plotting_scope(prepared_input = scores_and_ntiles,scope="compare_models")
plot_cumgains(data=plot_input)
plot_cumgains(data=plot_input,custom_line_colors=c("orange","purple"))
plot_cumgains(data=plot_input,highlight_ntile=20)

modelplotr documentation built on Oct. 23, 2020, 8:20 p.m.