| PipeOpEncodePL | R Documentation |
Abstract base class for piecewise linear encoding.
Piecewise linear encoding works by splitting values of features into distinct bins, through an algorithm implemented
in private$.get_bins(), and then creating new feature columns through a continuous alternative to one-hot encoding.
Here, one new feature per bin is constructed, with values being either
0, if the original value was below the lower bin boundary,
1, if the original value was above or equal to the upper bin boundary, or
a scaled value between 0 and 1, if the original value was inside the bin boundaries. Scaling is done by
offsetting the original value by the lower bin boundary and dividing by the bin width.
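The encoding rule above can be sketched in base R as follows. This is a minimal illustration that follows the description literally, not the package's implementation, which may differ in details. The helper name `ple_encode` and the column names `bin1`, `bin2`, ... are assumptions made for this example.

```r
# Piecewise linear encoding of one numeric vector, given sorted bin boundaries.
# `ple_encode` is a hypothetical helper and not part of mlr3pipelines.
ple_encode <- function(x, boundaries) {
  stopifnot(is.numeric(x), is.numeric(boundaries), length(boundaries) >= 2,
            !is.unsorted(boundaries, strictly = TRUE))
  lower <- boundaries[-length(boundaries)]  # lower boundary of each bin
  upper <- boundaries[-1]                   # upper boundary of each bin
  # One column per bin: offset by the lower boundary, divide by the bin width,
  # then clamp to [0, 1] so values outside the bin become exactly 0 or 1.
  encoded <- vapply(seq_along(lower), function(i) {
    pmin(pmax((x - lower[i]) / (upper[i] - lower[i]), 0), 1)
  }, numeric(length(x)))
  encoded <- matrix(encoded, nrow = length(x))  # keep matrix shape for length-1 input
  colnames(encoded) <- paste0("bin", seq_along(lower))
  encoded
}

x <- c(0, 2.5, 5, 9)
enc <- ple_encode(x, boundaries = c(0, 5, 10))
print(enc)
```

With the boundaries 0, 5 and 10 there are two bins, so each value is mapped to two columns. The value 9 lies above the first bin (encoded as 1) and four fifths of the way through the second (encoded as 0.8).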
PipeOps inheriting from this class encode columns of type numeric and integer. Use the PipeOpTaskPreproc
$affect_columns functionality to only encode a subset of columns, or only encode columns of a certain type, etc.
Abstract R6Class object inheriting from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp.
PipeOpEncodePL$new(id = "encodepl", param_set = ps(), param_vals = list(), packages = character(0), task_type = "Task")
id :: character(1)
Identifier of resulting object. See $id slot of PipeOp.
param_set :: ParamSet
Parameter space description. This should be created by the subclass and given to super$initialize().
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings given in param_set. The
subclass should have its own param_vals parameter and pass it on to super$initialize(). Default list().
packages :: character
Set of all required packages for the PipeOp's private$.train() and private$.predict() methods. See $packages slot.
Default is character(0).
task_type :: character(1)
The class of Task that should be accepted as input and will be returned as output. This
should generally be a character(1) identifying a type of Task, e.g. "Task", "TaskClassif" or
"TaskRegr" (or another subclass introduced by other packages). Default is "Task".
Input and output channels are inherited from PipeOpTaskPreproc.
The output is the input Task with all affected numeric and integer columns encoded using piecewise linear encoding.
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc, as well as:
bins :: named list
Named list of numeric vectors. Each element corresponds to and is named after one of the affected feature columns
and contains the bin boundaries derived through private$.get_bins().
The parameters are the parameters inherited from PipeOpTaskPreproc.
PipeOpEncodePL is an abstract class inheriting from PipeOpTaskPreprocSimple that allows easier implementation
of different binning algorithms for piecewise linear encoding. The respective binning algorithm should be implemented
as private$.get_bins().
Only fields inherited from PipeOp.
Methods inherited from PipeOpTaskPreprocSimple/PipeOpTaskPreproc/PipeOp, as well as:
.get_bins(task, cols)
(Task, character) -> named list
Abstract method for splitting the value range of a feature column into distinct bins. The argument cols should
give the names of the feature columns of the task for which bins should be derived. Returns a named list of
numeric vectors containing the bin boundaries for each affected feature column, with each element named after
its feature column.
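The return contract of .get_bins() can be illustrated with a small base R sketch. As a simplifying assumption it operates on a plain data.frame rather than a Task, so it is not a drop-in method body. The function name `get_bins_equal_width`, its `n_bins` argument and the equal-width binning strategy are all hypothetical choices made for this example.

```r
# Hypothetical stand-in for a .get_bins() implementation. It takes a plain
# data.frame instead of a Task (an assumption that keeps the sketch free of
# package dependencies) and derives equal-width bins per column.
get_bins_equal_width <- function(data, cols, n_bins = 3) {
  stopifnot(is.data.frame(data), all(cols %in% names(data)), n_bins >= 1)
  bins <- lapply(cols, function(col) {
    rng <- range(data[[col]], na.rm = TRUE)
    # n_bins bins are described by n_bins + 1 boundaries
    seq(rng[1], rng[2], length.out = n_bins + 1)
  })
  names(bins) <- cols  # name each element after its feature column
  bins
}

toy <- data.frame(a = c(0, 4, 10), b = c(1, 2, 3), g = c("x", "y", "z"))
bins <- get_bins_equal_width(toy, cols = c("a", "b"), n_bins = 2)
str(bins)
```

The result is a named list with one numeric vector of boundaries per requested column, which is the shape described for the bins element of $state. A real subclass would read the column values from the task instead of a data.frame.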
Gorishniy Y, Rubachev I, Babenko A (2022). “On Embeddings for Numerical Features in Tabular Deep Learning.” In Advances in Neural Information Processing Systems, volume 35, 24991–25004. https://proceedings.neurips.cc/paper_files/paper/2022/hash/9e9f0ffc3d836836ca96cbf8fe14b105-Abstract-Conference.html.
https://mlr-org.com/pipeops.html
Other PipeOps:
PipeOp,
PipeOpEnsemble,
PipeOpImpute,
PipeOpTargetTrafo,
PipeOpTaskPreproc,
PipeOpTaskPreprocSimple,
mlr_pipeops,
mlr_pipeops_adas,
mlr_pipeops_blsmote,
mlr_pipeops_boxcox,
mlr_pipeops_branch,
mlr_pipeops_chunk,
mlr_pipeops_classbalancing,
mlr_pipeops_classifavg,
mlr_pipeops_classweights,
mlr_pipeops_colapply,
mlr_pipeops_collapsefactors,
mlr_pipeops_colroles,
mlr_pipeops_copy,
mlr_pipeops_datefeatures,
mlr_pipeops_decode,
mlr_pipeops_encode,
mlr_pipeops_encodeimpact,
mlr_pipeops_encodelmer,
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree,
mlr_pipeops_featureunion,
mlr_pipeops_filter,
mlr_pipeops_fixfactors,
mlr_pipeops_histbin,
mlr_pipeops_ica,
mlr_pipeops_imputeconstant,
mlr_pipeops_imputehist,
mlr_pipeops_imputelearner,
mlr_pipeops_imputemean,
mlr_pipeops_imputemedian,
mlr_pipeops_imputemode,
mlr_pipeops_imputeoor,
mlr_pipeops_imputesample,
mlr_pipeops_info,
mlr_pipeops_isomap,
mlr_pipeops_kernelpca,
mlr_pipeops_learner,
mlr_pipeops_learner_pi_cvplus,
mlr_pipeops_learner_quantiles,
mlr_pipeops_missind,
mlr_pipeops_modelmatrix,
mlr_pipeops_multiplicityexply,
mlr_pipeops_multiplicityimply,
mlr_pipeops_mutate,
mlr_pipeops_nearmiss,
mlr_pipeops_nmf,
mlr_pipeops_nop,
mlr_pipeops_ovrsplit,
mlr_pipeops_ovrunite,
mlr_pipeops_pca,
mlr_pipeops_proxy,
mlr_pipeops_quantilebin,
mlr_pipeops_randomprojection,
mlr_pipeops_randomresponse,
mlr_pipeops_regravg,
mlr_pipeops_removeconstants,
mlr_pipeops_renamecolumns,
mlr_pipeops_replicate,
mlr_pipeops_rowapply,
mlr_pipeops_scale,
mlr_pipeops_scalemaxabs,
mlr_pipeops_scalerange,
mlr_pipeops_select,
mlr_pipeops_smote,
mlr_pipeops_smotenc,
mlr_pipeops_spatialsign,
mlr_pipeops_subsample,
mlr_pipeops_targetinvert,
mlr_pipeops_targetmutate,
mlr_pipeops_targettrafoscalerange,
mlr_pipeops_textvectorizer,
mlr_pipeops_threshold,
mlr_pipeops_tomek,
mlr_pipeops_tunethreshold,
mlr_pipeops_unbranch,
mlr_pipeops_updatetarget,
mlr_pipeops_vtreat,
mlr_pipeops_yeojohnson
Other Piecewise Linear Encoding PipeOps:
mlr_pipeops_encodeplquantiles,
mlr_pipeops_encodepltree