PipeOpImpute | R Documentation |
Abstract base class for feature imputation.
Abstract R6Class object inheriting from PipeOp.
PipeOpImpute$new(id, param_set = ps(), param_vals = list(), whole_task_dependent = FALSE, packages = character(0), task_type = "Task")
id :: character(1)
Identifier of resulting object. See $id slot of PipeOp.
param_set :: ParamSet
Parameter space description. This should be created by the subclass and given to super$initialize().
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings given in param_set. The subclass should have its own param_vals parameter and pass it on to super$initialize(). Default list().
whole_task_dependent :: logical(1)
Whether the context_columns parameter should be added, which lets the user limit the columns that are used for imputation inference. This should generally be FALSE if imputation depends only on individual features (e.g. mode imputation), and TRUE if imputation depends on other features as well (e.g. kNN-imputation).
packages :: character
Set of all required packages for the PipeOp's private$.train and private$.predict methods. See $packages slot. Default is character(0).
task_type :: character(1)
The class of Task that should be accepted as input and will be returned as output. This should generally be a character(1) identifying a type of Task, e.g. "Task", "TaskClassif" or "TaskRegr" (or another subclass introduced by other packages). Default is "Task".
feature_types :: character
Feature types affected by the PipeOp. See private$.select_cols() for more information.
PipeOpImpute has one input channel named "input", taking a Task, or a subclass of Task if the task_type construction argument is given as such; both during training and prediction.
PipeOpImpute has one output channel named "output", producing a Task, or a subclass; the Task type is the same as for input; both during training and prediction.
The output Task is the modified input Task with features imputed according to the private$.impute() function.
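The channel layout can be seen by running a concrete imputer. This is a minimal sketch, assuming the mlr3 and mlr3pipelines packages are installed; it uses po("imputemean") and the "pima" example task from mlr3, which are choices made here for illustration.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("pima")      # example task with missing values
op = po("imputemean")   # a concrete subclass of PipeOpImpute

# Inputs are passed as a list with one element per input channel.
train_out = op$train(list(task))
names(train_out)              # names of the output channels
class(train_out$output)[1]    # Task class of the result

# Prediction reuses the state learned during training.
predict_out = op$predict(list(task))
predict_out$output$missings()
```

Both calls return a list with one element per output channel, so the imputed Task is retrieved with `$output`.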
The $state is a named list; besides members added by inheriting classes, the members are:
affected_cols :: character
Names of features being selected by the affect_columns parameter.
context_cols :: character
Names of features being selected by the context_columns parameter.
intasklayout :: data.table
Copy of the training Task's $feature_types slot. This is used during prediction to ensure that the prediction Task has the same features, feature layout, and feature types as during training.
outtasklayout :: data.table
Copy of the trained Task's $feature_types slot. This is used during prediction to ensure that the Task resulting from the prediction operation has the same features, feature layout, and feature types as after training.
model :: named list
Model used for imputation. This is a list named by Task features, containing the result of the private$.train_imputer() or private$.train_nullmodel() function for each one.
imputed_train :: character
Names of features that were imputed during training. This is used to ensure that factor levels that were added during training are also added during prediction.
Note that features that are imputed during prediction but not during training will still have inconsistent factor levels.
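To make the state members listed above concrete, the following sketch trains a median imputer and inspects its $state. It assumes mlr3 and mlr3pipelines are installed; the choice of po("imputemedian") and the "pima" task is illustrative only.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("pima")
op = po("imputemedian")
invisible(op$train(list(task)))   # training populates op$state

names(op$state)           # state members
op$state$affected_cols    # features selected for imputation
op$state$model$glucose    # model entry stored for one feature
```

For this imputer the model entry of a feature is a single number, the median of its non-missing training values.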
affect_columns :: function | Selector | NULL
What columns the PipeOpImpute should operate on. The parameter must be a Selector function, which takes a Task as argument and returns a character of features to use. See Selector for example functions. Defaults to NULL, which selects all features.
context_columns :: function | Selector | NULL
What columns the PipeOpImpute imputation may depend on. This parameter is only present if the constructor is called with the whole_task_dependent argument set to TRUE. The parameter must be a Selector function, which takes a Task as argument and returns a character of features to use. See Selector for example functions. Defaults to NULL, which selects all features.
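As a small illustration of affect_columns, the sketch below restricts a mean imputer to a single feature using selector_name(). It assumes mlr3 and mlr3pipelines are installed; the "pima" task and the feature names "glucose" and "insulin" come from that example dataset and are used here only for demonstration.

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("pima")

# Only "glucose" is imputed; other features keep their missing values.
op = po("imputemean", affect_columns = selector_name("glucose"))
out = op$train(list(task))$output

out$missings()[c("glucose", "insulin")]
```

Any Selector function can be supplied in the same way, for example one that selects by feature type rather than by name.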
PipeOpImpute is an abstract class inheriting from PipeOp that makes implementing imputer PipeOps simple.
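As a rough illustration of how little a subclass needs to provide, the sketch below defines a hypothetical PipeOpImputeMax that fills missing numeric values with the training maximum. The class name and the id "imputemax" are invented for this example and are not part of mlr3pipelines. The sketch assumes that R6, mlr3 and mlr3pipelines are installed and that the installed version accepts the feature_types construction argument; private$.impute() is overloaded explicitly so that the example does not depend on the default sampling behaviour.

```r
library(R6)
library(mlr3)
library(mlr3pipelines)

# Hypothetical imputer for illustration only; not shipped with mlr3pipelines.
PipeOpImputeMax = R6Class("PipeOpImputeMax",
  inherit = PipeOpImpute,
  public = list(
    initialize = function(id = "imputemax", param_vals = list()) {
      # Restrict the operator to numeric and integer features.
      super$initialize(id, param_vals = param_vals,
        feature_types = c("numeric", "integer"))
    }
  ),
  private = list(
    # Builds the model entry for one feature: its largest observed value.
    .train_imputer = function(feature, type, context) {
      max(feature, na.rm = TRUE)
    },
    # Replaces every missing entry with the stored model value.
    .impute = function(feature, type, model, context) {
      feature[is.na(feature)] = model
      feature
    }
  )
)

task = tsk("pima")   # example task with missing values in numeric features
op = PipeOpImputeMax$new()
imputed = op$train(list(task))$output
imputed$missings()
```

Column selection, state handling and the train/predict bookkeeping are left to the base class; the subclass only supplies the per-feature model and how it is applied.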
Fields inherited from PipeOp.
Methods inherited from PipeOp, as well as:
.select_cols(task)
(Task) -> character
Selects which columns the PipeOp operates on. In contrast to the affect_columns parameter, private$.select_cols() is for the inheriting class to determine which columns the operator should function on, e.g. based on feature type, while affect_columns is a way for the user to limit the columns that a PipeOpImpute should operate on. This method can optionally be overloaded when inheriting PipeOpImpute; if this method is not overloaded, it defaults to selecting the columns of type indicated by the feature_types construction argument.
.train_imputer(feature, type, context)
(atomic, character(1), data.table) -> any
Abstract function that must be overloaded when inheriting. Called once for each feature selected by affect_columns to create the model entry to be used for private$.impute(). This function is only called for features with at least one non-missing value.
.train_nullmodel(feature, type, context)
(atomic, character(1), data.table) -> any
Like .train_imputer(), but only called for each feature that only contains missing values. This is not an abstract function and, if not overloaded, gives a default response of 0 (integer, numeric), c(TRUE, FALSE) (logical), all available levels (factor/ordered), or the empty string (character).
.impute(feature, type, model, context)
(atomic, character(1), any, data.table) -> atomic
Imputes the features. model is the model created by private$.train_imputer(). Default behaviour is to assume model is an atomic vector from which values are sampled to impute missing values of feature. model may have an attribute probabilities for non-uniform sampling.
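The documented defaults of .train_nullmodel() and .impute() can be sketched in base R. The helpers default_nullmodel() and default_impute() below are hypothetical stand-ins written for illustration, not the package's implementation. In particular, the special case for a model of length one is an assumption added here, because sample() treats a single positive number as the range 1:x.

```r
# Hypothetical stand-in for the documented .train_nullmodel() defaults.
default_nullmodel = function(feature, type) {
  switch(type,
    integer = 0L,
    numeric = 0,
    logical = c(TRUE, FALSE),
    factor = ,                    # falls through to the "ordered" branch
    ordered = levels(feature),
    character = "",
    stop("unsupported feature type: ", type)
  )
}

# Hypothetical stand-in for the documented default .impute() behaviour.
default_impute = function(feature, model) {
  nas = which(is.na(feature))
  if (length(nas) == 0) {
    return(feature)
  }
  if (length(model) == 1) {
    # Assumption: a single value is used directly instead of being sampled.
    feature[nas] = model
  } else {
    # Sample replacements, weighted by the optional "probabilities" attribute.
    feature[nas] = sample(model, length(nas), replace = TRUE,
      prob = attr(model, "probabilities"))
  }
  feature
}

# A factor with only missing values falls back to all of its levels.
all_missing = factor(c(NA, NA), levels = c("a", "b"))
default_nullmodel(all_missing, "factor")

# Non-uniform sampling: 20 has probability zero, so only 10 is ever drawn.
model = structure(c(10, 20), probabilities = c(1, 0))
default_impute(c(1, NA, 3, NA), model)
```

When the probabilities attribute is absent, attr() returns NULL and sample() draws uniformly from the model values.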
https://mlr-org.com/pipeops.html
Other PipeOps: PipeOp, PipeOpEnsemble, PipeOpTargetTrafo, PipeOpTaskPreproc, PipeOpTaskPreprocSimple, mlr_pipeops, mlr_pipeops_adas, mlr_pipeops_blsmote, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_encode, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_rowapply, mlr_pipeops_scale, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_smotenc, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson
Other Imputation PipeOps: mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample