When you build input for plotting_scope() yourself
It's very easy to apply modelplotr to predictive models developed in caret, mlr, h2o or keras. For models developed differently, even those built outside of R, it only takes a bit more work to use modelplotr. This section introduces the required format and gives an example.
To make plots with modelplotr, you are not required to use the function prepare_scores_and_ntiles() to generate the input data. You can create your own data frame containing actuals, probabilities and ntiles. The 1st ntile holds the fraction 1/#ntiles of cases with the highest model probability; the last ntile holds the fraction 1/#ntiles of cases with the lowest model probability. In that case, make sure the input data frame contains the following columns and formats:
column | type | definition |
model_label | Factor | Name of the model object |
dataset_label | Factor | Datasets to include in the plot as factor levels |
y_true | Factor | Target with actual values |
prob_[tv1] | Decimal | Probability according to model for target value 1 |
prob_[tv2] | Decimal | Probability according to model for target value 2 |
... | ... | ... |
prob_[tvn] | Decimal | Probability according to model for target value n |
ntl_[tv1] | Integer | Ntile based on probability according to model for target value 1 |
ntl_[tv2] | Integer | Ntile based on probability according to model for target value 2 |
... | ... | ... |
ntl_[tvn] | Integer | Ntile based on probability according to model for target value n |
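
As an illustration of the table above, here is a minimal base R sketch that builds such a data frame from scores produced elsewhere. The target values "yes" and "no", the labels, and the helper assign_ntile() are made up for this example and are not part of modelplotr. The ntiles are assigned by rank rather than by quantile cutoffs, which is an alternative to the approach in the example further down; ties are broken by order of appearance.

```r
# Toy scores, e.g. exported from a model built outside R (values are made up).
actuals  <- c("yes", "no", "no", "yes", "no", "no", "yes", "no", "no", "no")
prob_yes <- c(0.91, 0.15, 0.42, 0.77, 0.08, 0.33, 0.64, 0.21, 0.55, 0.02)
prob_no  <- 1 - prob_yes
ntiles   <- 5

# Hypothetical helper: rank-based ntiles, where ntile 1 holds the highest
# probabilities. Ties are broken by order of appearance.
assign_ntile <- function(prob, n) {
  ranks <- rank(-prob, ties.method = "first")
  as.integer(ceiling(ranks * n / length(prob)))
}

# One row per scored case, with the columns listed in the table.
scores_and_ntiles <- data.frame(
  model_label   = factor("external model"),
  dataset_label = factor("holdout data"),
  y_true        = factor(actuals),
  prob_yes      = prob_yes,
  prob_no       = prob_no,
  ntl_yes       = assign_ntile(prob_yes, ntiles),
  ntl_no        = assign_ntile(prob_no, ntiles)
)

str(scores_and_ntiles)
```

With 10 rows and 5 ntiles, each ntile holds two cases; the two highest values of prob_yes (0.91 and 0.77) end up in ntl_yes = 1.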
# load example data (bank clients with/without a term deposit - see ?bank_td for details)
library(modelplotr)
library(dplyr)
data("bank_td")
# prepare data for training model for binomial target has_td and train models
train_index = sample(seq(1, nrow(bank_td)),size = 0.5*nrow(bank_td) ,replace = FALSE)
train = bank_td[train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
test = bank_td[-train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
#train logistic regression model with stats package
glm.model <- glm(has_td ~.,family=binomial(link='logit'),data=train)
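#score model
# note: for a factor response, glm() models the probability of the second factor level;
# this example assumes that level is 'no.term.deposit' (check with levels(bank_td$has_td))
prob_no.term.deposit <- stats::predict(glm.model,newdata=train,type='response')
prob_term.deposit <- 1-prob_no.term.deposit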
#set number of ntiles
ntiles = 10
# determine cutoffs
cutoffs = c(stats::quantile(prob_term.deposit,probs = seq(0,1,1/ntiles),na.rm = TRUE))
#calculate ntile values
ntl_term.deposit <- (ntiles+1)-as.numeric(cut(prob_term.deposit,breaks=cutoffs,include.lowest=TRUE))
ntl_no.term.deposit <- (ntiles+1)-ntl_term.deposit
# create scored data frame
scores_and_ntiles <- train %>%
  select(has_td) %>%
  mutate(model_label = factor('logistic regression'),
         dataset_label = factor('train data'),
         y_true = factor(has_td),
         prob_term.deposit = prob_term.deposit,
         prob_no.term.deposit = prob_no.term.deposit,
         ntl_term.deposit = ntl_term.deposit,
         ntl_no.term.deposit = ntl_no.term.deposit) %>%
  select(-has_td)
# add test data
#score model on test data
prob_no.term.deposit <- stats::predict(glm.model,newdata=test,type='response')
prob_term.deposit <- 1-prob_no.term.deposit
#set number of ntiles
ntiles = 10
# determine cutoffs
cutoffs = c(stats::quantile(prob_term.deposit,probs = seq(0,1,1/ntiles),na.rm = TRUE))
#calculate ntile values
ntl_term.deposit <- (ntiles+1)-as.numeric(cut(prob_term.deposit,breaks=cutoffs,include.lowest=TRUE))
ntl_no.term.deposit <- (ntiles+1)-ntl_term.deposit
scores_and_ntiles <- scores_and_ntiles %>%
  rbind(
    test %>%
      select(has_td) %>%
      mutate(model_label = factor('logistic regression'),
             dataset_label = factor('test data'),
             y_true = factor(has_td),
             prob_term.deposit = prob_term.deposit,
             prob_no.term.deposit = prob_no.term.deposit,
             ntl_term.deposit = ntl_term.deposit,
             ntl_no.term.deposit = ntl_no.term.deposit) %>%
      select(-has_td)
  )
# build the plot input from the hand-made data frame and plot cumulative gains
plot_input <- plotting_scope(prepared_input = scores_and_ntiles, scope = 'compare_datasets')
plot_cumgains(data = plot_input)
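
Before passing a hand-built data frame to plotting_scope(), it can help to check it against the column table in this section. The sketch below is one way to do that in base R. The function check_prepared_input() is hypothetical and not part of modelplotr, and it only tests the columns and types listed in the table; it makes no claim about what plotting_scope() itself validates.

```r
# Hypothetical helper, not part of modelplotr: checks a data frame against
# the column table in this section and returns a character vector of problems
# (empty when nothing is found).
check_prepared_input <- function(df) {
  problems <- character(0)

  # Label and target columns must exist and be factors.
  for (col in c("model_label", "dataset_label", "y_true")) {
    if (!col %in% names(df)) {
      problems <- c(problems, paste("missing column:", col))
    } else if (!is.factor(df[[col]])) {
      problems <- c(problems, paste("not a factor:", col))
    }
  }

  # Each observed target value needs a prob_ and an ntl_ column.
  if ("y_true" %in% names(df)) {
    for (tv in unique(as.character(df$y_true))) {
      prob_col <- paste0("prob_", tv)
      ntl_col  <- paste0("ntl_", tv)
      if (!prob_col %in% names(df)) {
        problems <- c(problems, paste("missing column:", prob_col))
      } else if (!is.numeric(df[[prob_col]])) {
        problems <- c(problems, paste("not numeric:", prob_col))
      }
      if (!ntl_col %in% names(df)) {
        problems <- c(problems, paste("missing column:", ntl_col))
      } else if (!is.numeric(df[[ntl_col]]) ||
                 any(df[[ntl_col]] %% 1 != 0, na.rm = TRUE)) {
        problems <- c(problems, paste("not whole numbers:", ntl_col))
      }
    }
  }
  problems
}

# Usage on a small made-up data frame that deliberately lacks ntl_b.
toy <- data.frame(
  model_label   = factor("m1"),
  dataset_label = factor("test data"),
  y_true        = factor(c("a", "b", "a", "b")),
  prob_a        = c(0.9, 0.2, 0.6, 0.4),
  prob_b        = c(0.1, 0.8, 0.4, 0.6),
  ntl_a         = c(1L, 2L, 1L, 2L)
)
check_prepared_input(toy)
```

The ntile check accepts any whole-number numeric column rather than requiring integer class, because the ntile columns in the example above are computed with as.numeric() and are stored as doubles.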