| alluvial_model_response | R Documentation | 
Alluvial plots can display higher-dimensional data on a plane, which makes
them well suited to plotting the response of a statistical model
to changes in the input data across multiple dimensions. The practical limit
here is 4 dimensions. We need the data space (a sensible range of data
calculated based on the importance of the explanatory variables of the model,
as created by get_data_space) and the predictions
returned by the model in response to the data space.
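To make the idea of a data space concrete, here is a minimal base R sketch. It is an illustration under stated assumptions, not the implementation of get_data_space: the choice of `wt` and `hp` as the most important variables, the five quantile levels, and the name `dspace_sketch` are all hypothetical.

```r
# Illustrative only: a hand-rolled grid, not the package's get_data_space().
df <- mtcars[, c("disp", "wt", "hp", "cyl", "mpg")]
top_vars <- c("wt", "hp")  # assumed to be the two most important predictors
other_vars <- setdiff(names(df), c("disp", top_vars))

# Five representative values per top variable, spread over its observed range.
grid_values <- lapply(df[top_vars], function(x) {
  unname(quantile(x, probs = c(0.1, 0.3, 0.5, 0.7, 0.9)))
})

# Every combination of those values (5 values for each of 2 variables).
dspace_sketch <- expand.grid(grid_values)

# Hold the remaining predictors constant at their median.
for (v in other_vars) {
  dspace_sketch[[v]] <- median(df[[v]])
}

nrow(dspace_sketch)
```

The grid grows multiplicatively with each added variable, which is one way to see why the number of flows becomes hard to read beyond a handful of dimensions.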
alluvial_model_response(
  pred,
  dspace,
  imp,
  degree = 4,
  bin_labels = c("LL", "ML", "M", "MH", "HH"),
  col_vector_flow = c("#FF0065", "#009850", "#A56F2B", "#005EAA", "#710500", "#7B5380",
    "#9DD1D1"),
  method = "median",
  force = FALSE,
  params_bin_numeric_pred = list(bins = 5),
  pred_train = NULL,
  stratum_label_size = 3.5,
  ...
)
pred | 
 vector, predictions, if method = 'pdp' use
  | 
dspace | 
 data frame, returned by
  | 
imp | 
 data frame, with not more than two columns: one numeric column containing importance measures and one character or factor column containing the corresponding variable names as found in the training data.  | 
degree | 
 integer, number of top important variables to select. Plotting more than 4 will result in too many flows and the alluvial plot will not be very readable, Default: 4  | 
bin_labels | 
 labels for prediction bins from low to high, Default: c("LL", "ML", "M", "MH", "HH")  | 
col_vector_flow | 
 character vector, defines flow colours, Default: c('#FF0065', '#009850', '#A56F2B', '#005EAA', '#710500', '#7B5380', '#9DD1D1')  | 
method | 
 character vector, one of c('median', 'pdp'). Default: 'median'  | 
force | 
 logical, force plotting of over 1500 flows, Default: FALSE  | 
params_bin_numeric_pred | 
 list, additional parameters passed to
  | 
pred_train | 
 numeric vector, base the automated binning of the pred vector on the distribution of the training predictions. This is useful if marginal histograms are added to the plot later. Default: NULL  | 
stratum_label_size | 
 numeric, Default: 3.5  | 
... | 
 additional parameters passed to
  | 
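The interplay of bin_labels and pred_train can be pictured with a small base R sketch. It assumes simple equal-width bins and made-up stand-in vectors; the package's own binning routine may transform or cut the values differently, so treat this only as an illustration of deriving break points from the training predictions and then applying them to pred.

```r
# Illustrative only: equal-width binning with base R, not the package's own routine.
bin_labels <- c("LL", "ML", "M", "MH", "HH")

set.seed(1)
pred_train <- rnorm(200, mean = 100, sd = 20)  # stand-in for training predictions
pred <- c(60, 95, 100, 105, 140)               # stand-in for data-space predictions

# Derive the break points from the combined range so both vectors share bins.
rng <- range(c(pred_train, pred))
breaks <- seq(rng[1], rng[2], length.out = length(bin_labels) + 1)

# Assign each prediction to one of the five labelled bins, low to high.
pred_binned <- cut(pred, breaks = breaks, labels = bin_labels, include.lowest = TRUE)
table(pred_binned)
```

Sharing break points between the two vectors keeps the bins comparable if histograms of the training predictions are placed next to the plot later.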
This model visualisation approach follows the "visualising the model in the dataspace" principle as described in Wickham H, Cook D, Hofmann H (2015) Visualizing statistical models: Removing the blindfold. Statistical Analysis and Data Mining 8(4) <doi:10.1002/sam.11271>.
ggplot2 object
alluvial_wide,
get_data_space,
alluvial_model_response_caret
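library(easyalluvial)
# mtcars2 ships with the package; drop the id column before modelling
df = mtcars2[, ! names(mtcars2) %in% 'ids' ]
# fit a random forest and extract the variable importance
m = randomForest::randomForest( disp ~ ., df)
imp = m$importance
# build the data space from the 3 most important variables and predict on it
dspace = get_data_space(df, imp, degree = 3)
pred = predict(m, newdata = dspace)
alluvial_model_response(pred, dspace, imp, degree = 3)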
# partial dependency plotting method
## Not run: 
 pred = get_pdp_predictions(df, imp
                            , .f_predict = randomForest:::predict.randomForest
                            , m
                            , degree = 3
                            , bins = 5)
 alluvial_model_response(pred, dspace, imp, degree = 3, method = 'pdp')
 
## End(Not run)
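The examples pass m$importance directly as imp. When importance scores come from another source, the two-column layout described for the imp argument can be assembled by hand. The sketch below uses made-up scores, and the column names `variable` and `importance` are hypothetical choices, not names required by the package.

```r
# Stand-in for a one-column importance matrix with variable names as row names.
imp_matrix <- matrix(
  c(12.5, 40.2, 7.9),  # made-up scores for illustration
  ncol = 1,
  dimnames = list(c("wt", "hp", "cyl"), "IncNodePurity")
)

# Two-column layout: one character column of names, one numeric column of scores.
imp_df <- data.frame(
  variable = rownames(imp_matrix),
  importance = imp_matrix[, 1],
  stringsAsFactors = FALSE,
  row.names = NULL
)

# Sort so the most important variable comes first.
imp_df <- imp_df[order(imp_df$importance, decreasing = TRUE), ]
imp_df
```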