Splits numeric features into equally spaced bins.
See graphics::hist() for details.
Values that fall outside the training data range during prediction are
assigned to the lowest or highest bin, respectively.
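The out-of-range behaviour described above can be illustrated with a minimal base R sketch. It is not necessarily how the PipeOp is implemented internally. It assumes that opening the outermost bins to -Inf and Inf is an acceptable way to capture unseen extremes, and the vectors x_train and x_new are made-up illustration data.

```r
# Made-up training data and new data containing out-of-range values
x_train = c(1.2, 2.5, 3.1, 4.8, 5.0, 6.7, 7.3, 8.9)
x_new = c(-10, 3, 100)

# Compute equally spaced breakpoints as hist() would, without plotting
breaks = graphics::hist(x_train, plot = FALSE)$breaks

# Assumption: open the outermost bins so that unseen extremes still land in a bin
breaks[1] = -Inf
breaks[length(breaks)] = Inf

# Assign each new value to its bin; the result is an ordered factor
binned = cut(x_new, breaks = breaks, include.lowest = TRUE, ordered_result = TRUE)
print(binned)
```

With this construction, values below the training minimum end up in the first level and values above the training maximum in the last level, so no value is mapped to NA.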
R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
PipeOpHistBin$new(id = "histbin", param_vals = list())
id :: character(1)
Identifier of resulting object, default "histbin".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
Input and output channels are inherited from PipeOpTaskPreproc.
The output is the input Task with all affected numeric features replaced by their binned versions.
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:
breaks :: list
List of intervals representing the bins for each numeric feature.
The parameters are the parameters inherited from PipeOpTaskPreproc, as well as:
breaks :: character(1) | numeric | function
Either a character(1) string naming an algorithm to compute the number of cells,
a numeric(1) giving the number of breaks for the histogram,
a numeric vector giving the breakpoints between the histogram cells, or
a function to compute the vector of breakpoints or to compute the number of cells.
Default is algorithm "Sturges" (see grDevices::nclass.Sturges()).
For details see hist().
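As a hedged illustration of the accepted forms of breaks, the following sketch constructs the PipeOp with the constructor signature shown above and passes each form through param_vals. The ids, the explicit breakpoint vector, and the square-root rule are hypothetical choices for this example only. Because the same setting is applied to every numeric feature, the explicit vector is chosen here to span the range of all iris features.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")

# Algorithm name, passed on to hist()
pop_scott = PipeOpHistBin$new(id = "histbin_scott",
  param_vals = list(breaks = "Scott"))

# Single number: hist() treats it as a suggested number of cells
pop_n = PipeOpHistBin$new(id = "histbin_n",
  param_vals = list(breaks = 4))

# Explicit breakpoints; must cover the range of every numeric feature
pop_vec = PipeOpHistBin$new(id = "histbin_vec",
  param_vals = list(breaks = c(0, 2, 4, 6, 8)))

# Function returning a number of cells (hypothetical square-root rule)
pop_fun = PipeOpHistBin$new(id = "histbin_fun",
  param_vals = list(breaks = function(x) ceiling(sqrt(length(x)))))

# Train each variant and keep the resulting tasks
pops = list(pop_scott, pop_n, pop_vec, pop_fun)
outs = lapply(pops, function(p) p$train(list(task))[[1]])

# Inspect the breakpoints stored for the explicit-vector variant
pop_vec$state$breaks
```

Note that with a single number, hist() may return a different number of cells than requested, since it chooses "pretty" breakpoints.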
Uses the graphics::hist() function.
Only methods inherited from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
https://mlr-org.com/pipeops.html
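library("mlr3")
library("mlr3pipelines")  # provides po() and PipeOpHistBin

task = tsk("iris")
pop = po("histbin")

# Original numeric features
task$data()
# Features after binning
pop$train(list(task))[[1]]$data()
# Learned breakpoints per feature
pop$state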
