pipe_range_classifier: Generates features for regression problems through...
In jeroenvdhoven/datapiper: datapiper

For regression tasks, uses classification models to predict whether the response is larger than each of a series of given threshold values. The predictions are added as new features.
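The idea can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the datapiper implementation: the data, the threshold values, the temporary column name `tmp_target`, and the `_` separator in the output column names are all hypothetical.

```r
# Illustrative sketch only; base R, not the datapiper source.
set.seed(1)
train <- data.frame(x = rnorm(200))
train$y <- 2 * train$x + rnorm(200)

thresholds <- c(-1, 0, 1)  # hypothetical threshold values

for (threshold in thresholds) {
  # Temporary binary target: is the response above this threshold?
  train$tmp_target <- as.integer(train$y > threshold)

  # Binomial glm that excludes the response itself from the predictors.
  fit <- glm(tmp_target ~ x, data = train, family = binomial())

  # Store the predicted probability as a new feature column.
  # The "_" separator before the threshold is an assumption.
  new_column <- paste0("y_quantile_", threshold)
  train[[new_column]] <- predict(fit, newdata = train, type = "response")
}

# Remove the temporary target once all classifiers are trained.
train$tmp_target <- NULL
```

Each new column holds the estimated probability that the response exceeds one threshold, which a downstream regression model can then use as a feature.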

pipe_range_classifier(train, response, exclude_columns = response,
  base_temporary_column_name = "base_temporary_column_name",
  base_definitive_column_name = paste0(response, "_quantile"),
  quantiles = 0, even_spreads = 0, values, model = c("glm",
  "xgboost")[1], controls)

`train`	The training dataset, as a data.frame or data.table. A data.table may be modified by reference.
`response`	String denoting the name of the column that should be used as the response variable.
`exclude_columns`	Columns that should not be used in the models. Defaults to the response column, which is always excluded even when it is not listed here.
`base_temporary_column_name`	Base name that will be used to create a temporary variable for training the classifier. Use this to ensure no existing columns are overwritten.
`base_definitive_column_name`	Base name of the columns that store the predictions of the created classifiers. The threshold value is appended to this name. Use this to ensure no existing columns are overwritten.
`quantiles`	Number of quantiles used to generate threshold values. The function actually generates `quantiles+2` quantiles and uses the 2nd through the `quantiles+1`-th, dropping the first and last to avoid nonsensical thresholds. Non-negative integer, defaults to 0.
`even_spreads`	Number of evenly spread thresholds to use. These are based on the minimum and maximum value of the response in `train`. Thresholds are defined similarly to `quantiles`. Non-negative integer, defaults to 0.
`values`	Threshold values to use. The function checks whether these fall within the range of the response in `train`.
`model`	Type of model to use. Currently only binomial glm and xgboost are available.
`controls`	Parameters for the chosen model. Leave empty or set to NA to use the defaults. For glm, see `glm.control`; for xgboost, see `xgb.train`.
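The threshold rules described for `quantiles`, `even_spreads` and `values` can be sketched in base R. The helper `candidate_thresholds` below is hypothetical and not part of datapiper. It assumes that the range check on `values` is strict and that the three sources are merged into one sorted, de-duplicated vector; the package may handle both differently.

```r
# Hypothetical helper; a sketch of the documented rules, not the datapiper source.
candidate_thresholds <- function(y, quantiles = 0, even_spreads = 0,
                                 values = numeric(0)) {
  # From n + 2 points keep the 2nd through the (n + 1)-th,
  # i.e. drop the first and last point.
  inner <- function(points, n) if (n > 0) points[2:(n + 1)] else numeric(0)

  # Quantile-based thresholds.
  probs <- seq(0, 1, length.out = quantiles + 2)
  q <- inner(quantile(y, probs = probs, names = FALSE), quantiles)

  # Evenly spread thresholds between the minimum and maximum of y.
  e <- inner(seq(min(y), max(y), length.out = even_spreads + 2), even_spreads)

  # Keep only user-supplied values inside the observed response range
  # (strict inequality is an assumption).
  v <- values[values > min(y) & values < max(y)]

  sort(unique(c(q, e, v)))
}

y <- 0:10
candidate_thresholds(y, quantiles = 3)          # 2.5 5.0 7.5
candidate_thresholds(y, even_spreads = 1)       # 5
candidate_thresholds(y, values = c(-2, 4, 20))  # 4
```

With `quantiles = 3`, five quantiles are computed (0%, 25%, 50%, 75%, 100%) and only the middle three are kept, since a threshold at the minimum or maximum would put nearly every observation in a single class.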