imputation_simple: Simple imputation

View source: R/trans_imputation_simple.R

imputation_simple    R Documentation

Simple imputation

Description

Impute missing values in mixed datasets using simple statistics.

Usage

imputation_simple(method = c("median", "mean"), cols = NULL)

Arguments

method

imputation method for numeric columns: "median" or "mean"

cols

optional vector of column names to impute (default: all supported columns)

Details

Numeric columns are imputed with the median or the mean, as selected by method. Factor, ordered factor, character, and logical columns are imputed with the mode (the most frequent observed value). This class is intended as a low-complexity baseline for preprocessing workflows. Recommending the median for numeric variables follows standard data preprocessing guidance, because the median is less sensitive to outliers than the mean; mode imputation is the usual baseline for categorical attributes.
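The rule described above can be illustrated in base R. The following is only a minimal sketch under stated assumptions, not the daltoolbox implementation: the helpers mode_value and impute_simple_sketch and the toy data frame are hypothetical names introduced here, and ties in the mode are assumed to resolve to the first most frequent value.

```r
# Hypothetical helper (not part of daltoolbox): most frequent observed value.
# table() drops NA by default; which.max() picks the first value on ties.
mode_value <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# Hypothetical sketch of simple imputation over all columns of a data frame.
impute_simple_sketch <- function(df, method = c("median", "mean")) {
  method <- match.arg(method)
  center <- if (method == "median") stats::median else mean
  for (col in names(df)) {
    x <- df[[col]]
    missing <- is.na(x)
    # Skip columns with nothing to fill or nothing to learn from
    if (!any(missing) || all(missing)) next
    if (is.numeric(x)) {
      x[missing] <- center(x, na.rm = TRUE)
    } else if (is.factor(x) || is.character(x)) {
      x[missing] <- mode_value(x)
    } else if (is.logical(x)) {
      x[missing] <- as.logical(mode_value(x))
    }
    df[[col]] <- x
  }
  df
}

# Toy data with one outlier in the numeric column
toy <- data.frame(
  score = c(1, 2, NA, 100),
  group = factor(c("a", "b", "b", NA)),
  flag = c(TRUE, NA, TRUE, FALSE)
)
filled <- impute_simple_sketch(toy, method = "median")
```

With method = "median" the missing score is filled with 2, whereas method = "mean" would fill it with about 34.3, which shows how a single outlier (100) pulls the mean away from the bulk of the data.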

Value

returns an object of class imputation_simple

References

Han, J., Kamber, M., Pei, J. (2011). Data Mining: Concepts and Techniques.

Little, R. J. A., Rubin, D. B. (2019). Statistical Analysis with Missing Data.

Examples

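library(daltoolbox)

# Introduce missing values into a numeric and a factor column
data(iris)
iris_na <- iris
iris_na$Sepal.Length[c(2, 10, 25)] <- NA
iris_na$Species[c(3, 15)] <- NA

# Fit the imputer on the incomplete data, then apply it
imp <- imputation_simple(method = "median")
imp <- fit(imp, iris_na)
iris_imp <- transform(imp, iris_na)

# Inspect the imputed columns; no NA entries should remain
summary(iris_imp$Sepal.Length)
table(iris_imp$Species, useNA = "ifany")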

daltoolbox documentation built on May 14, 2026, 9:06 a.m.