clean_outliers: Outliers cleaning wrapper


View source: R/outliers.R

Description

Outliers cleaning wrapper

Usage

clean_outliers(dataset, method, ...)

Arguments

dataset

The dataset whose outliers we want to clean

method

The method used to clean outliers. Possible values are:

  • "univariate" detects outliers column by column (an outlier is an abnormal value within a column) and replaces them with the mean or median of the corresponding column

  • "multivariate" detects outliers using a multicolumn approach, so that an outlier is a whole observation (row), and deletes those observations

...

further arguments for the method
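To make the "univariate" behaviour concrete, here is a minimal base R sketch of the idea: flag values in a single column by z-score and replace them with the mean or median. It is not smartdata's implementation. The helper name fill_univariate_outliers, its cutoff argument, and the choice to compute the replacement from the non-outlying values are assumptions made for illustration only.

```r
# Hypothetical helper, NOT part of smartdata: illustrates the univariate idea
# (detect abnormal values in one column, then fill them in).
fill_univariate_outliers <- function(x, cutoff = 2, fill = c("median", "mean")) {
  fill <- match.arg(fill)
  # Standardise the column and flag values far from the mean
  z <- (x - mean(x)) / sd(x)
  is_outlier <- abs(z) > cutoff
  # Assumption: the replacement is computed from the non-outlying values only
  kept <- x[!is_outlier]
  replacement <- if (fill == "median") median(kept) else mean(kept)
  x[is_outlier] <- replacement
  x
}

values <- c(10, 11, 9, 10, 12, 11, 10, 100)
cleaned <- fill_univariate_outliers(values, cutoff = 2, fill = "median")
print(cleaned)
```

The column keeps its length: only the flagged value changes, which is why the univariate method returns a dataset with the same number of rows as the input.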

Value

The treated dataset, with outliers either replaced ("univariate") or removed ("multivariate")
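The row-removal case can be sketched in base R as follows. This uses the classical Mahalanobis distance with a chi-squared cutoff, a stand-in technique chosen for illustration; the detection that smartdata performs for type = "adj" or type = "quan" may differ. The helper name drop_multivariate_outliers and its prob argument are hypothetical.

```r
# Hypothetical helper, NOT part of smartdata: illustrates the multivariate idea
# (an outlier is a whole row, and outlying rows are deleted).
drop_multivariate_outliers <- function(data, prob = 0.975) {
  # Only numeric columns take part in the distance computation
  numeric_cols <- data[, sapply(data, is.numeric), drop = FALSE]
  # Squared Mahalanobis distance of every row from the column means
  d2 <- mahalanobis(numeric_cols, colMeans(numeric_cols), cov(numeric_cols))
  # Assumption: rows beyond the chi-squared quantile are treated as outliers
  cutoff <- qchisq(prob, df = ncol(numeric_cols))
  data[d2 <= cutoff, , drop = FALSE]
}

trimmed <- drop_multivariate_outliers(iris)
# Rows may be dropped, but the columns are unchanged
print(c(before = nrow(iris), after = nrow(trimmed)))
```

Unlike the univariate sketch, the result can have fewer rows than the input, so downstream code should not assume that row counts are preserved.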

Examples

library("smartdata")

# Multivariate methods delete the observations (rows) detected as outliers
super_iris <- clean_outliers(iris, method = "multivariate", type = "adj")
super_iris <- clean_outliers(iris, method = "multivariate", type = "quan")

# Use mean as method to substitute outliers
super_iris <- clean_outliers(iris, method = "univariate", type = "z", prob = 0.9, fill = "mean")
# Use median as method to substitute outliers
super_iris <- clean_outliers(iris, method = "univariate", type = "z", prob = 0.9, fill = "median")
# Use chi-sq instead of z p-values
super_iris <- clean_outliers(iris, method = "univariate", type = "chisq",
                             prob = 0.9, fill = "median")
# Use the interquartile range instead (the lim argument is mandatory when using it)
super_iris <- clean_outliers(iris, method = "univariate", type = "iqr", lim = 0.9, fill = "median")

smartdata documentation built on Dec. 19, 2019, 1:08 a.m.