calculate_variable_split: Internal Function for Split Points for Selected Variables

View source: R/calculate_variable_profile.R

Description

This function calculates candidate splits for each selected variable. For numerical variables, splits are calculated as percentiles (more generally, quantiles at grid_points evenly spaced probability levels). For all other variables, splits are the unique values.
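The behaviour described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name sketch_splits and the toy data frame are hypothetical, and dropping duplicated quantiles with unique() is an assumption.

```r
# Hypothetical helper illustrating the description; NOT the package source.
sketch_splits <- function(data, variables = colnames(data), grid_points = 101) {
  splits <- lapply(variables, function(v) {
    x <- data[[v]]
    if (is.numeric(x)) {
      # Quantiles at grid_points evenly spaced probability levels in [0, 1]
      probs <- seq(0, 1, length.out = grid_points)
      # Assumption: repeated quantile values are collapsed
      unique(unname(quantile(x, probs = probs, na.rm = TRUE)))
    } else {
      # All other variables: the distinct observed values
      sort(unique(x))
    }
  })
  names(splits) <- variables
  splits
}

# Toy data, invented for this example
toy <- data.frame(
  age = c(21, 35, 35, 48, 60),
  class = c("1st", "2nd", "2nd", "3rd", "1st"),
  stringsAsFactors = FALSE
)

splits <- sketch_splits(toy, grid_points = 5)
str(splits)
```

With only five observations and five grid points, the numeric grid has fewer than five entries because the repeated value 35 is reported once.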

Usage

calculate_variable_split(
  data,
  variables = colnames(data),
  grid_points = 101,
  variable_splits_type = "quantiles",
  new_observation = NA
)

## Default S3 method:
calculate_variable_split(
  data,
  variables = colnames(data),
  grid_points = 101,
  variable_splits_type = "quantiles",
  new_observation = NA
)

Arguments

data

validation dataset, used to determine the distribution of observations

variables

names of variables for which splits shall be calculated

grid_points

number of points used for response path

variable_splits_type

how variable grids are calculated. Use "quantiles" (default) for percentiles or "uniform" for a uniform grid of points

new_observation

if specified (not NA), all values in new_observation are included in the returned splits
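To make the difference between the two variable_splits_type settings concrete, the following base R sketch builds both kinds of grid for one skewed numeric column and then merges in an extra observed value. It rests on assumptions: "uniform" is taken to mean evenly spaced points between the minimum and maximum, and the merge step is only one plausible reading of how new_observation values are included. The vector x and the value new_value are invented for illustration.

```r
x <- c(1, 2, 2, 3, 50)   # skewed toy column, invented for this example
grid_points <- 5

# "quantiles": grid follows the data distribution
probs <- seq(0, 1, length.out = grid_points)
quantile_grid <- unique(unname(quantile(x, probs = probs)))

# "uniform" (assumed meaning): evenly spaced between min and max
uniform_grid <- seq(min(x), max(x), length.out = grid_points)

# Assumed handling of new_observation: its value is added to the grid
new_value <- 10
quantile_grid_with_obs <- sort(unique(c(quantile_grid, new_value)))

print(quantile_grid)
print(uniform_grid)
print(quantile_grid_with_obs)
```

The quantile grid concentrates points where the data are dense, while the uniform grid spreads them over the whole range, including the sparse region near the outlier.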

Details

Note that calculate_variable_split is an S3 generic. To work with non-standard data sources (such as H2O ddf or external databases), define your own method for it.
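A method for a custom data source might look like the sketch below. Everything specific to the data source is hypothetical: the class name lazy_source, the constructor make_lazy_source and its fetch element are invented for illustration, and the method ignores new_observation for brevity. The stand-in generic is only there so the sketch runs without the package; with ingredients attached, the package's own generic would be used instead.

```r
# Stand-in generic so the sketch runs on its own (skipped if one already exists)
if (!exists("calculate_variable_split")) {
  calculate_variable_split <- function(data, ...) {
    UseMethod("calculate_variable_split")
  }
}

# Hypothetical wrapper that fetches columns on demand from some backend
make_lazy_source <- function(df) {
  structure(
    list(columns = names(df), fetch = function(v) df[[v]]),
    class = "lazy_source"
  )
}

# Method for the hypothetical class; same arguments as the documented usage
calculate_variable_split.lazy_source <- function(data,
                                                 variables = data$columns,
                                                 grid_points = 101,
                                                 variable_splits_type = "quantiles",
                                                 new_observation = NA) {
  splits <- lapply(variables, function(v) {
    x <- data$fetch(v)
    if (!is.numeric(x)) {
      return(sort(unique(x)))
    }
    if (variable_splits_type == "uniform") {
      # Assumed meaning of "uniform": evenly spaced between min and max
      seq(min(x), max(x), length.out = grid_points)
    } else {
      probs <- seq(0, 1, length.out = grid_points)
      unique(unname(quantile(x, probs = probs)))
    }
  })
  names(splits) <- variables
  splits  # new_observation is not handled in this sketch
}

src <- make_lazy_source(data.frame(
  x = c(1, 2, 3, 4, 5),
  g = c("a", "b", "a", "b", "a"),
  stringsAsFactors = FALSE
))
res <- calculate_variable_split(src, grid_points = 3)
str(res)
```

The method returns the same shape as described under Value, a named list with one element per selected variable, so code that consumes the splits should not need to know where the data came from.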

Value

A named list with splits for selected variables


ingredients documentation built on April 10, 2021, 5:06 p.m.