plot_roi: ROI plot
In modelplotr: Plots to Evaluate the Business Performance of Predictive Models


Generates the Return on Investment (ROI) plot. It plots the cumulative revenues as a percentage of investments up to and including each ntile when the model is used for campaign selection. It answers the following business question: when we apply the model and select up to and including ntile X, what is the expected return on investment of the campaign? In addition to the data, this plot requires three extra parameters: fixed_costs, variable_costs_per_unit and profit_per_unit.

plot_roi(
  data = plot_input,
  highlight_ntile = "max_roi",
  highlight_how = "plot_text",
  save_fig = FALSE,
  save_fig_filename = NA,
  custom_line_colors = NA,
  custom_plot_text = NULL,
  fixed_costs,
  variable_costs_per_unit,
  profit_per_unit
)

`data`	Dataframe. Must be created with `plotting_scope` or otherwise meet the required input format.
`highlight_ntile`	Integer or string ("max_roi" or "max_profit"). Specifies the ntile at which the plot is annotated and/or performance is highlighted. Default value is "max_roi", which highlights the ntile where ROI is highest.
`highlight_how`	String. How to annotate the plot. Possible values: "plot_text", "plot", "text". Default is "plot_text", which highlights the ntile and value on the plot and also adds explanatory text below the plot. "plot" only highlights the plot and does not add text below it explaining the plot at the chosen ntile. "text" adds text below the plot explaining the plot at the chosen ntile but does not highlight the plot.
`save_fig`	Logical. Save plot to file? Default = FALSE. When set to TRUE, saved plot is optimized for 36x24cm.
`save_fig_filename`	String. Filename of saved plot. By default, the plot is saved as tempdir()/plotname.png.
`custom_line_colors`	Vector of Strings. Specifies colors for the lines in the plot. When not specified, colors from the RColorBrewer palette "Set1" are used.
`custom_plot_text`	List. List with customized textual elements for plot. Create a list with defaults by using `customize_plot_text` and override default values to customize.
`fixed_costs`	Numeric. Specifies the fixed costs related to a selection based on the model. These costs are constant and do not vary with selection size (ntiles).
`variable_costs_per_unit`	Numeric. Specifies the variable costs per selected unit for a selection based on the model. These costs vary with selection size (ntiles).
`profit_per_unit`	Numeric. Specifies the profit per unit when the selected unit converts / responds positively.
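
To make the role of the three financial arguments concrete, here is a minimal base R sketch of how a cumulative ROI per ntile could be derived from them. It is an illustration under stated assumptions, not the package's internal code: the counts in `cum_selected` and `cum_positives` are made-up numbers, and the formula ROI = (revenues - investments) / investments is an assumption about how the plotted value is defined.

```r
# Hypothetical cumulative counts for four ntiles (made up, not from bank_td)
cum_selected  <- c(100, 200, 300, 400)  # units selected up to and including each ntile
cum_positives <- c(64, 95, 125, 140)    # positive responders among those units

# Same argument values as used in the examples on this page
fixed_costs             <- 1000
variable_costs_per_unit <- 10
profit_per_unit         <- 50

# Assumed definitions: investments grow with selection size,
# revenues grow with the number of responders
investments <- fixed_costs + variable_costs_per_unit * cum_selected
revenues    <- profit_per_unit * cum_positives
profit      <- revenues - investments
roi         <- profit / investments

# Ntile with the highest ROI versus ntile with the highest absolute profit
best_roi_ntile    <- which.max(roi)
best_profit_ntile <- which.max(profit)

print(data.frame(ntile = seq_along(roi), investments, revenues, profit, roi))
```

With these made-up numbers the ROI peaks at the first ntile while absolute profit peaks at the third, which is why "max_roi" and "max_profit" can highlight different ntiles for the same model.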

Returns a gtable containing 6 grobs.

modelplotr for generic info on the package modelplotr

vignette('modelplotr')

plotting_scope for details on the function plotting_scope that transforms a dataframe created with prepare_scores_and_ntiles or aggregate_over_ntiles to a dataframe in the required format for all modelplotr plots.

aggregate_over_ntiles for details on the function aggregate_over_ntiles that aggregates the output of prepare_scores_and_ntiles to create a dataframe with aggregated actuals and predictions. In most cases, you do not need to use it since the plotting_scope function will call this function automatically.

https://github.com/modelplot/modelplotr for details on the package

https://modelplot.github.io/ for our blog on the value of the model plots

# load example data (Bank clients with/without a term deposit - see ?bank_td for details)
data("bank_td")
# prepare data for training model for binomial target has_td and train models
train_index =  sample(seq(1, nrow(bank_td)),size = 0.5*nrow(bank_td) ,replace = FALSE)
train = bank_td[train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
test = bank_td[-train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
#train models using caret... (or use mlr or H2o or keras ... see ?prepare_scores_and_ntiles)
# setting caret cross validation, here tuned for speed (not accuracy!)
fitControl <- caret::trainControl(method = "cv",number = 2,classProbs=TRUE)
# random forest using ranger package, here tuned for speed (not accuracy!)
rf = caret::train(has_td ~.,data = train, method = "ranger",trControl = fitControl,
                  tuneGrid = expand.grid(.mtry = 2,.splitrule = "gini",.min.node.size=10))
# mnl model using glmnet package
mnl = caret::train(has_td ~.,data = train, method = "glmnet",trControl = fitControl)
# load modelplotr
library(modelplotr)
# transform datasets and model objects to input for modelplotr
scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("train","test"),
                         dataset_labels = list("train data","test data"),
                         models = list("rf","mnl"),
                         model_labels = list("random forest","multinomial logit"),
                         target_column="has_td",
                         ntiles=100)
# set scope for analysis (default: no comparison)
plot_input <- plotting_scope(prepared_input = scores_and_ntiles)
plot_roi(data=plot_input,fixed_costs=1000,variable_costs_per_unit= 10,profit_per_unit=50)
plot_roi(data=plot_input,fixed_costs=1000,variable_costs_per_unit= 10,profit_per_unit=50,
         highlight_ntile=20)
plot_roi(data=plot_input,fixed_costs=1000,variable_costs_per_unit= 10,profit_per_unit=50,
         highlight_ntile="max_profit")

Package modelplotr loaded! Happy model plotting!
Loading required package: lattice
Loading required package: ggplot2
... scoring caret model "rf" on dataset "train".
... scoring caret model "mnl" on dataset "train".
... scoring caret model "rf" on dataset "test".
... scoring caret model "mnl" on dataset "test".
Data preparation step 1 succeeded! Dataframe created.
Warning message:
`select_()` is deprecated as of dplyr 0.7.0.
Please use `select()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
Data preparation step 2 succeeded! Dataframe created.
"prepared_input" aggregated...

Data preparation step 3 succeeded! Dataframe created.

No comparison specified, default values are used. 

Single evaluation line will be plotted: Target value "term.deposit" plotted for dataset "test data" and model "multinomial logit.
"
-> To compare models, specify: scope = "compare_models"
-> To compare datasets, specify: scope = "compare_datasets"
-> To compare target classes, specify: scope = "compare_targetclasses"
-> To plot one line, do not specify scope or specify scope = "no_comparison".


Warning message:
`group_by_()` is deprecated as of dplyr 0.7.0.
Please use `group_by()` instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
 
Plot annotation for plot: Return on Investment (ROI)
- When we select ntiles 1 until 15 in dataset test data using model multinomial logit to target term.deposit cases the expected return on investment is 56%. 
 
 
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2. 
 
Plot annotation for plot: Return on Investment (ROI)
- When we select ntiles 1 until 20 in dataset test data using model multinomial logit to target term.deposit cases the expected return on investment is 47%. 
 
 
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2. 
 
Plot annotation for plot: Return on Investment (ROI)
- When we select ntiles 1 until 22 in dataset test data using model multinomial logit to target term.deposit cases the expected return on investment is 47%. 
 
 
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2.