mlr_pipeops_imputeoor: Out of Range Imputation
In mlr3pipelines: Preprocessing Operators and Pipelines for 'mlr3'

Description

Impute factor features (factor, ordered and character) by adding a new level ".MISSING".

Impute numerical features (integer and numeric) by a constant value shifted below the minimum or above the maximum, computed as min(x) - offset - multiplier * diff(range(x)) or max(x) + offset + multiplier * diff(range(x)), respectively.
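
The two formulas can be checked by hand with a few lines of base R. This is only a sketch of the arithmetic, not the package's own code: the helper name oor_value and its argument below are made up for this illustration, and ignoring missing values via na.rm = TRUE is an assumption.

```r
# Base R sketch of the out-of-range constant; not the package's own code.
# `oor_value` and `below` are hypothetical names used only here.
oor_value = function(x, below = TRUE, offset = 1, multiplier = 1) {
  # Assumption: missing values are ignored when computing the range.
  spread = diff(range(x, na.rm = TRUE))
  if (below) {
    min(x, na.rm = TRUE) - offset - multiplier * spread
  } else {
    max(x, na.rm = TRUE) + offset + multiplier * spread
  }
}

x = c(2, 5, NA, 10)
oor_value(x)                 # 2 - 1 - 1 * 8 = -7
oor_value(x, below = FALSE)  # 10 + 1 + 1 * 8 = 19
```

With the default offset and multiplier, the imputed constant lies one full range plus one unit outside the observed values, so a tree can isolate the formerly missing observations with a single split.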

This type of imputation is especially sensible in the context of tree-based methods, see also Ding & Simonoff (2010).

If a factor feature has missing values during prediction but had none during training, this adds an unseen level ".MISSING", which is a problem for most models. It is therefore recommended to use po("fixfactors") and po("imputesample", affect_columns = selector_type(types = c("factor", "ordered"))) (or some other imputation method) after this imputation method if missing values are expected during prediction in factor columns that had no missing values during training.

Format

R6Class object inheriting from PipeOpImpute/PipeOp.

Construction

PipeOpImputeOOR$new(id = "imputeoor", param_vals = list())

id :: character(1)
Identifier of resulting object, default "imputeoor".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().

Input and Output Channels

Input and output channels are inherited from PipeOpImpute.

The output is the input Task with all affected features having missing values imputed as described above.

State

The $state is a named list with the $state elements inherited from PipeOpImpute.

The $state$model contains either ".MISSING" used for character and factor (also ordered) features or numeric(1) indicating the constant value used for imputation of integer and numeric features.

Parameters

The parameters are the parameters inherited from PipeOpImpute, as well as:

min :: logical(1)
Should integer and numeric features be shifted below the minimum? Initialized to TRUE. If FALSE they are shifted above the maximum. See also the description above.
offset :: numeric(1)
Numerical non-negative offset as used in the description above for integer and numeric features. Initialized to 1.
multiplier :: numeric(1)
Numerical non-negative multiplier as used in the description above for integer and numeric features. Initialized to 1.
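
As an illustration of how these parameters interact, the following sketch trains the PipeOp on a toy task and inspects the stored constant. The data frame toy and its column names are made up for this example, and the expected value in the comments follows from the formula in the description rather than from a recorded run.

```r
library("mlr3")
library("mlr3pipelines")

# Toy data made up for this illustration: one numeric feature with a gap.
toy = data.frame(
  x = c(2, 5, NA, 10),
  target = factor(c("p", "q", "p", "q"))
)
task = as_task_classif(toy, target = "target")

# Shift above the maximum: 10 + 0.5 + 2 * diff(range(x)) = 10 + 0.5 + 16
op = po("imputeoor", min = FALSE, offset = 0.5, multiplier = 2)
imputed = op$train(list(task))[[1]]

imputed$data(cols = "x")  # the NA in x is expected to become 26.5
op$state$model$x          # constant stored for use during prediction

# Parameters can also be changed after construction
op$param_set$values$min = TRUE
```

Because the constant is stored in the state during training, the same value is reused for missing entries at prediction time, even if the prediction data has a different range.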

Internals

Adds an explicit new level to factor and ordered features, but not to character features. For integer and numeric features, uses the min, max, diff and range functions. Integer and numeric features that are entirely NA are imputed as 0.
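
The difference between factor-like and character features can be sketched in base R. The helper impute_missing_level below is hypothetical and is not the package's implementation; in particular, appending the new level at the end of the existing levels is an assumption of this sketch.

```r
# Hypothetical helper mirroring the description above; not the package's code.
impute_missing_level = function(x, value = ".MISSING") {
  if (is.factor(x)) {
    # Register the new level first; assigning an unknown level would give NA.
    # levels<- keeps the class, so ordered factors stay ordered.
    levels(x) = c(levels(x), value)
  }
  # Character vectors need no level bookkeeping and are filled directly.
  x[is.na(x)] = value
  x
}

f = factor(c("a", NA, "b"))
levels(impute_missing_level(f))        # "a" "b" ".MISSING"
impute_missing_level(c("a", NA, "b"))  # "a" ".MISSING" "b"
```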

Fields

Only fields inherited from PipeOp.

Methods

Only methods inherited from PipeOpImpute/PipeOp.

References

Ding Y, Simonoff JS (2010). “An Investigation of Missing Data Methods for Classification Trees Applied to Binary Response Data.” Journal of Machine Learning Research, 11(6), 131-170. https://jmlr.org/papers/v11/ding10a.html.

See Also

Other PipeOps: PipeOp, PipeOpEncodePL, PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreproc, PipeOpTaskPreprocSimple, mlr_pipeops, mlr_pipeops_adas, mlr_pipeops_blsmote, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_decode, mlr_pipeops_encode, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_encodeplquantiles, mlr_pipeops_encodepltree, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_learner_pi_cvplus, mlr_pipeops_learner_quantiles, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nearmiss, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_rowapply, mlr_pipeops_scale, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_smotenc, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tomek, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson

Other Imputation PipeOps: PipeOpImpute, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputesample

Examples

library("mlr3")
library("mlr3pipelines")  # provides po(), %>>% and selector_type()
set.seed(2409)
data = tsk("pima")$data()
# add a factor and an ordered column that contain missing values
data$y = factor(c(NA, sample(letters, size = 766, replace = TRUE), NA))
data$z = ordered(c(NA, sample(1:10, size = 767, replace = TRUE)))
task = TaskClassif$new("task", backend = data, target = "diabetes")
task$missings()
# named `op` so that the po() function is not shadowed
op = po("imputeoor")
new_task = op$train(list(task = task))[[1]]
new_task$missings()
new_task$data()

# recommended use when missing values are expected during prediction on
# factor columns that had no missing values during training
gr = po("imputeoor") %>>%
  po("fixfactors") %>>%
  po("imputesample", affect_columns = selector_type(types = c("factor", "ordered")))
t1 = as_task_classif(data.frame(l = as.ordered(letters[1:3]), t = letters[1:3]), target = "t")
t2 = as_task_classif(data.frame(l = as.ordered(c("a", NA, NA)), t = letters[1:3]), target = "t")
gr$train(t1)[[1]]$data()

# missing values during prediction are sampled randomly
gr$predict(t2)[[1]]$data()

mlr3pipelines documentation built on June 17, 2025, 9:08 a.m.
