Some models are able to capture relative dependencies, where the effect of one variable depends on the value of another. To visualise them, the dataset is split into three parts along a range variable: the 0-25, 25-75 and 75-100 percentile ranges for a numerical variable, or the three most common levels for a factor. The variable dependencies are then plotted for each of the three splits. In the mtcars example below, the model predicts an increase in disp as drat increases for cars with 8 cylinders, while the opposite is true for cars with only 6 cylinders.
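The three-way split described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code: the choice of drat and cyl as range variables, the object names, and the handling of values that fall exactly on a percentile boundary are all assumptions.

```r
# Illustrative sketch only: mimics the three-way split with base R.
# It is NOT taken from the package source.
data <- mtcars

# Numerical range variable: cut at the 25th and 75th percentiles.
# Assumption: boundary values go to the lower interval (cut() default).
breaks <- quantile(data$drat, probs = c(0, 0.25, 0.75, 1))
split_num <- cut(data$drat, breaks = breaks, include.lowest = TRUE,
                 labels = c("0-25", "25-75", "75-100"))
parts_num <- split(data, split_num)

# Categorical range variable: keep only the three most common levels.
cyl <- factor(data$cyl)
top3 <- names(sort(table(cyl), decreasing = TRUE))[1:3]
keep <- cyl %in% top3
parts_cat <- split(data[keep, ], droplevels(cyl[keep]))

# Number of rows in each of the three parts
sapply(parts_num, nrow)
sapply(parts_cat, nrow)
```

Each element of parts_num or parts_cat is a data frame holding one of the three splits, which is the kind of subset the dependency plots would be drawn for.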
f_model_plot_var_dep_over_spec_var_range(m, title, variables,
  range_variable, data, formula, data_ls, variable_color_code, log_y = F,
  limit = 12)
m | a model
title | model title
variables | character vector with variable names, or ranked variables as returned by f_model_importance()
range_variable | character vector denoting the range variable
data | dataset
formula | formula
data_ls | data_ls object generated by f_clean_data(), or a named list: list(data = <dataframe>, numericals = <vector with column names of numerical columns>). The data_ls object provides the entire dataset.
variable_color_code | dataframe created by f_plot_color_code_variables()
log_y | boolean, whether to use a log scale for the y axis
limit | integer, limits the number of variables to be plotted. Default: 12
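As a rough sketch of the named-list alternative for data_ls mentioned above, the list can be assembled by hand in base R. The element names data and numericals come from the argument description; whether the function needs further elements (the examples also read data_ls$categoricals) is not stated here, so treat this as an assumption rather than a guaranteed drop-in replacement for f_clean_data().

```r
# Hypothetical hand-built data_ls. Only the element names 'data' and
# 'numericals' are taken from the argument description; anything else the
# function might expect (e.g. 'categoricals') is not covered here.
num_cols <- names(mtcars)[sapply(mtcars, is.numeric)]

data_ls <- list(data = mtcars,
                numericals = num_cols)

# Inspect the top-level structure of the list
str(data_ls, max.level = 1)
```

In the raw mtcars data every column is numeric, so numericals lists all columns.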
Returns a grid that can be printed with gridExtra::grid.arrange().
## Not run:
# single output example ---------------------------------------
.f = randomForest::randomForest
data_ls = f_clean_data(mtcars)
data = data_ls$data
formula = disp~mpg+cyl+am+hp+drat+qsec+vs+gear+carb
m = .f(formula, data)
variables = f_model_importance( m, data)
title = unlist( stringr::str_split( class(m)[1], '\\.') )[1]
variable_color_code = f_plot_color_code_variables(data_ls)
limit = 10
log_y = FALSE
range_variable_num = data_ls$numericals[1]
range_variable_cat = data_ls$categoricals[1]
grid_num = f_model_plot_var_dep_over_spec_var_range(m
                                                    , title
                                                    , variables
                                                    , range_variable_num
                                                    , data
                                                    , formula
                                                    , data_ls
                                                    , variable_color_code
                                                    , log_y
                                                    , limit )
gridExtra::grid.arrange(grid_num)
# pipe example ------------------------------------------------
data_ls = f_clean_data(mtcars)
form = as.formula('disp~cyl+mpg+hp+am+gear+drat+wt+vs+carb')
variable_color_code = f_plot_color_code_variables(data_ls)
library(magrittr) # provides the %>% pipe used below
grids = pipelearner::pipelearner(data_ls$data) %>%
  pipelearner::learn_models( rpart::rpart, form ) %>%
  pipelearner::learn_models( randomForest::randomForest, form ) %>%
  pipelearner::learn_models( e1071::svm, form ) %>%
  pipelearner::learn() %>%
  # rank variables per model, use the top-ranked one as range variable
  dplyr::mutate( imp = purrr::map2(fit, train, f_model_importance)
               , range_var = purrr::map_chr(imp, function(x) head(x,1)$row_names )
               , grid = purrr::pmap( list( m = fit
                                         , title = model
                                         , variables = imp
                                         , range_variable = range_var
                                         , data = test
                                         )
                                   , f_model_plot_var_dep_over_spec_var_range
                                   , formula = form
                                   , data_ls = data_ls
                                   , variable_color_code = variable_color_code
                                   , log_y = FALSE
                                   , limit = 12
                                   )
               ) %>%
  .$grid
f_plot_obj_2_html( grids, type = "grids", output_file = 'test_me', title = 'Grids', height = 30 )
file.remove('test_me.html')
## End(Not run)