Description Usage Arguments Value When you build scores_and_ntiles yourself See Also Examples
View source: R/dataprepmodelplots.R
Build a dataframe object that contains actuals and predictions on the target variable for each dataset in datasets and each model in models.
prepare_scores_and_ntiles(
datasets,
dataset_labels,
models,
model_labels,
target_column,
ntiles = 10
)
datasets |
List of Strings. A list of the names of the dataframe objects to include in model evaluation. All dataframes need to contain a target variable and feature variables. |
dataset_labels |
List of Strings. A list of labels for the datasets, shown in plots.
When dataset_labels is not specified, the names from |
models |
List of Strings. List of the names of the model objects, containing parameters to apply models to datasets. To use this function, model objects need to be generated by the mlr, caret, h2o or keras package. Modelplotr automatically detects whether the model is built using mlr, caret, h2o or keras. |
model_labels |
List of Strings. Labels for the models, shown in plots.
When model_labels is not specified, the names from |
target_column |
String. Name of the target variable in datasets. Target can be either binary or multinomial. Continuous targets are not supported. |
ntiles |
Integer. Number of ntiles. The ntiles parameter specifies the number of equally sized buckets that the observations in each dataset are grouped into. By default, observations are grouped into 10 equally sized buckets, often referred to as deciles. |
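The grouping into equally sized buckets described for ntiles can be illustrated with a minimal base R sketch. The helper assign_ntiles is hypothetical and not part of modelplotr, and the convention that ntile 1 holds the highest predicted probabilities is an assumption taken from common gains/lift practice, not from this page.

```r
# Hypothetical helper (not a modelplotr function): assign each observation to
# one of `ntiles` equally sized buckets based on its predicted probability.
# Assumption: bucket 1 holds the highest probabilities.
assign_ntiles <- function(prob, ntiles = 10) {
  # Rank in descending order of probability; ties are broken by position
  rnk <- rank(-prob, ties.method = "first")
  # Map ranks 1..n onto buckets 1..ntiles
  ceiling(rnk * ntiles / length(prob))
}

set.seed(42)
prob <- runif(1000)                 # made-up predicted probabilities
ntl <- assign_ntiles(prob, ntiles = 10)
table(ntl)                          # number of observations per bucket
```

With 1000 observations and ntiles = 10, every bucket receives the same number of observations; when the number of rows is not a multiple of ntiles, bucket sizes differ by at most one.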
Dataframe. A dataframe is built based on the datasets and models specified. It contains the dataset name, the actuals on the target_column, the predicted probabilities for each target class (i.e. each unique target value) and the assignment to ntiles in the dataset for each target class.
To make plots with modelplotr, it is not required to use this function to generate input for the function plotting_scope. You can create your own dataframe containing actuals, predictions and ntiles. See build_input_yourself for an example of how to build the required input for plotting_scope or aggregate_over_ntiles yourself, within R or even outside of R.
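As a rough sketch of what a self-built input could look like, the base R code below assembles a dataframe with actuals, per-class probabilities and per-class ntiles for a binary target. The column names (model_label, dataset_label, y_true, prob_*, ntl_*) are an assumption about the expected layout and the data are made up; check build_input_yourself for the authoritative format before passing such a dataframe to plotting_scope.

```r
set.seed(1)
n <- 200

# Made-up actuals and predicted probabilities for a binary target
y_true   <- factor(sample(c("no", "yes"), n, replace = TRUE))
prob_yes <- runif(n)
prob_no  <- 1 - prob_yes

# Hypothetical helper: bucket 1 holds the highest probabilities (assumption)
to_ntile <- function(prob, ntiles = 10) {
  ceiling(rank(-prob, ties.method = "first") * ntiles / length(prob))
}

# Assumed column layout; verify against build_input_yourself
my_scores_and_ntiles <- data.frame(
  model_label   = factor("my model"),
  dataset_label = factor("my data"),
  y_true        = y_true,
  prob_no       = prob_no,
  prob_yes      = prob_yes,
  ntl_no        = to_ntile(prob_no),
  ntl_yes       = to_ntile(prob_yes)
)
str(my_scores_and_ntiles)
```

Each target class gets its own probability and ntile column because the ntile assignment depends on which class is treated as the class of interest.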
modelplotr for generic info on the package modelplotr
plotting_scope for details on the function plotting_scope that transforms a dataframe created with prepare_scores_and_ntiles or aggregate_over_ntiles to a dataframe in the required format for all modelplotr plots.
aggregate_over_ntiles for details on the function aggregate_over_ntiles that aggregates the output of prepare_scores_and_ntiles to create a dataframe with aggregated actuals and predictions. In most cases, you do not need to use it since the plotting_scope function will call this function automatically.
https://github.com/modelplot/modelplotr for details on the package
https://modelplot.github.io/ for our blog on the value of the model plots
## Not run:
# load example data (Bank clients with/without a term deposit - see ?bank_td for details)
data("bank_td")
# prepare data for training model for binomial target has_td and train models
train_index = sample(seq(1, nrow(bank_td)),size = 0.5*nrow(bank_td) ,replace = FALSE)
train = bank_td[train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
test = bank_td[-train_index,c('has_td','duration','campaign','pdays','previous','euribor3m')]
#train models using mlr...
trainTask <- mlr::makeClassifTask(data = train, target = "has_td")
testTask <- mlr::makeClassifTask(data = test, target = "has_td")
mlr::configureMlr() # this line is needed when using mlr without loading it (mlr::)
task = mlr::makeClassifTask(data = train, target = "has_td")
lrn = mlr::makeLearner("classif.randomForest", predict.type = "prob")
rf = mlr::train(lrn, task)
lrn = mlr::makeLearner("classif.multinom", predict.type = "prob")
mnl = mlr::train(lrn, task)
#... or train models using caret...
# setting caret cross validation, here tuned for speed (not accuracy!)
fitControl <- caret::trainControl(method = "cv",number = 2,classProbs=TRUE)
# random forest using ranger package, here tuned for speed (not accuracy!)
rf = caret::train(has_td ~.,data = train, method = "ranger",trControl = fitControl,
tuneGrid = expand.grid(.mtry = 2,.splitrule = "gini",.min.node.size=10))
# mnl model using glmnet package
mnl = caret::train(has_td ~.,data = train, method = "glmnet",trControl = fitControl)
#... or train models using h2o...
h2o::h2o.init()
h2o::h2o.no_progress()
h2o_train = h2o::as.h2o(train)
h2o_test = h2o::as.h2o(test)
gbm <- h2o::h2o.gbm(y = "has_td",
x = setdiff(colnames(train), "has_td"),
training_frame = h2o_train,
nfolds = 5)
#... or train models using keras.
x_train <- as.matrix(train[,-1]); y=train[,1]; y_train <- keras::to_categorical(as.numeric(y)-1);
`%>%` <- magrittr::`%>%`
nn <- keras::keras_model_sequential() %>%
keras::layer_dense(units = 16,kernel_initializer = "uniform",activation = 'relu',
input_shape = NCOL(x_train))%>%
keras::layer_dense(units = 16,kernel_initializer = "uniform", activation='relu') %>%
keras::layer_dense(units = length(levels(train[,1])),activation='softmax')
nn %>% keras::compile(optimizer='rmsprop',loss='categorical_crossentropy',metrics=c('accuracy'))
nn %>% keras::fit(x_train,y_train,epochs = 20,batch_size = 1028,verbose=0)
# preparation steps
scores_and_ntiles <- prepare_scores_and_ntiles(datasets=list("train","test"),
dataset_labels = list("train data","test data"),
models = list("rf","mnl", "gbm","nn"),
model_labels = list("random forest","multinomial logit",
"gradient boosting machine","artificial neural network"),
target_column="has_td")
plot_input <- plotting_scope(prepared_input = scores_and_ntiles)
plot_cumgains(data = plot_input)
plot_cumlift(data = plot_input)
plot_response(data = plot_input)
plot_cumresponse(data = plot_input)
plot_multiplot(data = plot_input)
plot_costsrevs(data=plot_input,fixed_costs=1000,variable_costs_per_unit=10,profit_per_unit=50)
plot_profit(data=plot_input,fixed_costs=1000,variable_costs_per_unit=10,profit_per_unit=50)
plot_roi(data=plot_input,fixed_costs=1000,variable_costs_per_unit=10,profit_per_unit=50)
## End(Not run)