calculate_variable_splits: Calculate Split Points for Selected Variables


View source: R/calculate_profiles.R

Description

This function calculates candidate splits for each selected variable. For numerical variables, splits are calculated as percentiles (in general, as quantiles taken at grid_points uniformly spaced probabilities). For all other variables, splits are the unique values of the variable.

Usage

calculate_variable_splits(
  data,
  variables = colnames(data),
  grid_points = 101,
  variable_splits_type = "quantiles"
)

Arguments

data

validation dataset, used to determine the distribution of observations

variables

names of variables for which splits shall be calculated

grid_points

number of grid points used for the response path

variable_splits_type

how variable grids shall be calculated. Use "quantiles" (default) for percentiles or "uniform" for a uniform grid of points
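
The difference between the two grid types can be illustrated with a minimal base R sketch. The helper `sketch_splits` below is hypothetical and is not part of the package; it assumes that "quantiles" means quantiles at evenly spaced probabilities with duplicates dropped, that "uniform" means evenly spaced values between the minimum and maximum, and that non-numeric variables fall back to their unique values. The package's actual implementation may differ in details.

```r
# Hypothetical helper (not the package's code): sketches the two grid types.
sketch_splits <- function(x, grid_points = 101, type = "quantiles") {
  if (!is.numeric(x)) {
    # Non-numeric variables: every distinct value is a candidate split.
    return(unique(x))
  }
  if (type == "quantiles") {
    # Quantiles at grid_points evenly spaced probabilities in [0, 1];
    # dropping duplicates here is an assumption of this sketch.
    probs <- seq(0, 1, length.out = grid_points)
    unique(unname(quantile(x, probs = probs, na.rm = TRUE)))
  } else {
    # Evenly spaced values between the observed minimum and maximum.
    seq(min(x, na.rm = TRUE), max(x, na.rm = TRUE), length.out = grid_points)
  }
}

x <- c(1, 2, 3, 4, 100)
sketch_splits(x, grid_points = 5, type = "quantiles")  # follows the data
sketch_splits(x, grid_points = 5, type = "uniform")    # follows the range
sketch_splits(c("a", "b", "a"))                        # unique values
```

With skewed data such as `x` above, the quantile grid concentrates points where observations are dense, while the uniform grid spreads them evenly over the range regardless of where the data lie.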

Details

Note that calculate_variable_splits is an S3 generic. To work with non-standard data sources (such as H2O ddf or external databases), provide your own method for the corresponding class.
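
As a rough sketch of what such a method could look like, the example below defines one for a hypothetical class `my_remote_table`, a list with a `columns` field and a `fetch_column` function. Both the class and its fields are invented for illustration. A stand-in generic with the signature from the Usage section is defined only when the package is not loaded, so the snippet runs on its own.

```r
# Stand-in generic, defined only if the package's generic is not available.
if (!exists("calculate_variable_splits")) {
  calculate_variable_splits <- function(data, variables = colnames(data),
                                        grid_points = 101,
                                        variable_splits_type = "quantiles") {
    UseMethod("calculate_variable_splits")
  }
}

# Method for the hypothetical class "my_remote_table".
calculate_variable_splits.my_remote_table <- function(data,
                                                      variables = data$columns,
                                                      grid_points = 101,
                                                      variable_splits_type = "quantiles") {
  splits <- lapply(variables, function(v) {
    x <- data$fetch_column(v)  # hypothetical accessor: pulls one column
    if (!is.numeric(x)) return(unique(x))
    if (variable_splits_type == "quantiles") {
      probs <- seq(0, 1, length.out = grid_points)
      unique(unname(quantile(x, probs = probs, na.rm = TRUE)))
    } else {
      seq(min(x, na.rm = TRUE), max(x, na.rm = TRUE), length.out = grid_points)
    }
  })
  names(splits) <- variables  # the result is a named list, one entry per variable
  splits
}

# Toy "remote" source backed by an in-memory list.
cols <- list(a = c(1, 2, 3), b = c("x", "y", "x"))
remote <- structure(
  list(columns = names(cols), fetch_column = function(v) cols[[v]]),
  class = "my_remote_table"
)
res <- calculate_variable_splits(remote, grid_points = 3)
```

The method supplies its own default for `variables` because `colnames()` would not return anything useful for this toy class; a real data source would substitute whatever call lists its columns.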

Value

A named list with splits for selected variables

Examples

library("DALEX")
 ## Not run: 
library("randomForest")
set.seed(59)
apartments_rf_model <- randomForest(m2.price ~ construction.year + surface + floor +
                                      no.rooms + district, data = apartments)
vars <- c("construction.year", "surface", "floor", "no.rooms", "district")
calculate_variable_splits(apartments, vars)

## End(Not run)

pbiecek/WhatIfPlots documentation built on July 23, 2020, 9:15 p.m.