mlr_pipeops_tomek | R Documentation |
Generates a cleaner data set by removing all majority-minority Tomek links.
The algorithm down-samples the data by removing all pairs of observations that form a Tomek link, i.e. a pair of observations that are each other's nearest neighbors and belong to different classes. Only numeric and integer features are taken into account for this, and they must not contain missing values.
This can only be applied to classification tasks. Multiclass classification is supported.
See themis::tomek for details.
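The definition of a Tomek link can be illustrated with a few lines of base R. This is only a minimal sketch on made-up toy data, assuming plain Euclidean distance; it is not the code used by themis::tomek, and the object names (x, y, nn, is_link) are chosen here for illustration.

```r
# Toy data (made up for illustration): two numeric features and a class label
x = matrix(c(0.0, 0.0,
             0.1, 0.0,
             1.0, 1.0,
             1.1, 1.0,
             3.0, 3.0), ncol = 2, byrow = TRUE)
y = factor(c("a", "b", "a", "a", "b"))

# Pairwise Euclidean distances; a point must not be its own neighbor
d = as.matrix(dist(x))
diag(d) = Inf

# Index of the nearest neighbor of each observation
nn = apply(d, 1, which.min)

# Observation i and nn[i] form a Tomek link if they are mutual nearest
# neighbors and carry different class labels
idx = seq_len(nrow(x))
is_link = nn[nn] == idx & y != y[nn]
tomek_rows = which(is_link)

# Down-sampled data: every observation that is part of a link is dropped
x_clean = x[!is_link, , drop = FALSE]
y_clean = y[!is_link]
```

In this toy example only the first two rows are mutual nearest neighbors with different labels, so they are the ones removed. Rows 3 and 4 are mutual nearest neighbors but share a class, and row 5 has a nearest neighbor that does not point back to it.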
R6Class object inheriting from PipeOpTaskPreproc/PipeOp.
PipeOpTomek$new(id = "tomek", param_vals = list())
id :: character(1)
Identifier of resulting object, default "tomek".
param_vals :: named list
List of hyperparameter settings, overwriting the hyperparameter settings that would otherwise be set during construction. Default list().
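As a hedged illustration of the constructor arguments, the sketch below builds the PipeOp once with defaults and twice with a custom identifier. The identifiers "tomek_custom" and "tomek_short" are hypothetical names used only for this example, and the themis package is assumed to be installed.

```r
library("mlr3pipelines")

# Default construction, matching the signature above
pop_default = PipeOpTomek$new()

# Custom identifier, e.g. to tell two Tomek steps apart within one graph
pop_custom = PipeOpTomek$new(id = "tomek_custom")

# The po() shorthand looks the PipeOp up by its dictionary key
pop_short = po("tomek", id = "tomek_short")

pop_default$id
pop_custom$id
pop_short$id
```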
Input and output channels are inherited from PipeOpTaskPreproc. Instead of a Task, a TaskClassif is used as input and output during training and prediction.
The output during training is the input Task with removed rows for pairs of observations that form a Tomek link.
The output during prediction is the unchanged input.
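The difference between the training and the prediction output can be checked directly. The following is a small sketch that uses the iris task as an assumed example and requires the themis package; the variable names train_out and predict_out are illustrative.

```r
library("mlr3")
library("mlr3pipelines")

task = tsk("iris")
pop = po("tomek")

# Training: rows that are part of a Tomek link are removed
train_out = pop$train(list(task))[[1]]

# Prediction: the task is passed through unchanged
predict_out = pop$predict(list(task))[[1]]

task$nrow
train_out$nrow    # never larger than task$nrow
predict_out$nrow  # same as task$nrow
```

Because rows are only dropped during training, the PipeOp can sit in front of a learner without changing which observations receive predictions.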
The $state is a named list with the $state elements inherited from PipeOpTaskPreproc.
The parameters are the parameters inherited from PipeOpTaskPreproc.
Only fields inherited from PipeOp.
Only methods inherited from PipeOpTaskPreproc/PipeOp.
Tomek I (1976). “Two Modifications of CNN.” IEEE Transactions on Systems, Man and Cybernetics, 6(11), 769–772. doi:10.1109/TSMC.1976.4309452.
https://mlr-org.com/pipeops.html
Other PipeOps: PipeOp, PipeOpEncodePL, PipeOpEnsemble, PipeOpImpute, PipeOpTargetTrafo, PipeOpTaskPreproc, PipeOpTaskPreprocSimple, mlr_pipeops, mlr_pipeops_adas, mlr_pipeops_blsmote, mlr_pipeops_boxcox, mlr_pipeops_branch, mlr_pipeops_chunk, mlr_pipeops_classbalancing, mlr_pipeops_classifavg, mlr_pipeops_classweights, mlr_pipeops_colapply, mlr_pipeops_collapsefactors, mlr_pipeops_colroles, mlr_pipeops_copy, mlr_pipeops_datefeatures, mlr_pipeops_decode, mlr_pipeops_encode, mlr_pipeops_encodeimpact, mlr_pipeops_encodelmer, mlr_pipeops_encodeplquantiles, mlr_pipeops_encodepltree, mlr_pipeops_featureunion, mlr_pipeops_filter, mlr_pipeops_fixfactors, mlr_pipeops_histbin, mlr_pipeops_ica, mlr_pipeops_imputeconstant, mlr_pipeops_imputehist, mlr_pipeops_imputelearner, mlr_pipeops_imputemean, mlr_pipeops_imputemedian, mlr_pipeops_imputemode, mlr_pipeops_imputeoor, mlr_pipeops_imputesample, mlr_pipeops_kernelpca, mlr_pipeops_learner, mlr_pipeops_learner_pi_cvplus, mlr_pipeops_learner_quantiles, mlr_pipeops_missind, mlr_pipeops_modelmatrix, mlr_pipeops_multiplicityexply, mlr_pipeops_multiplicityimply, mlr_pipeops_mutate, mlr_pipeops_nearmiss, mlr_pipeops_nmf, mlr_pipeops_nop, mlr_pipeops_ovrsplit, mlr_pipeops_ovrunite, mlr_pipeops_pca, mlr_pipeops_proxy, mlr_pipeops_quantilebin, mlr_pipeops_randomprojection, mlr_pipeops_randomresponse, mlr_pipeops_regravg, mlr_pipeops_removeconstants, mlr_pipeops_renamecolumns, mlr_pipeops_replicate, mlr_pipeops_rowapply, mlr_pipeops_scale, mlr_pipeops_scalemaxabs, mlr_pipeops_scalerange, mlr_pipeops_select, mlr_pipeops_smote, mlr_pipeops_smotenc, mlr_pipeops_spatialsign, mlr_pipeops_subsample, mlr_pipeops_targetinvert, mlr_pipeops_targetmutate, mlr_pipeops_targettrafoscalerange, mlr_pipeops_textvectorizer, mlr_pipeops_threshold, mlr_pipeops_tunethreshold, mlr_pipeops_unbranch, mlr_pipeops_updatetarget, mlr_pipeops_vtreat, mlr_pipeops_yeojohnson
library("mlr3")
library("mlr3pipelines")  # provides po(); the "tomek" PipeOp also requires the themis package
# Create example task
task = tsk("iris")
task$head()
table(task$data(cols = "Species"))
# Down-sample data
pop = po("tomek")
tomek_result = pop$train(list(task))[[1]]$data()
nrow(tomek_result)
table(tomek_result$Species)