mad_remove_outliers: Remove outliers using the MAD method

View source: R/mad_remove_outliers.R

mad_remove_outliers    R Documentation

Remove outliers using the MAD method

Description

Detect outliers in a numeric vector using the Median Absolute Deviation (MAD) method and remove or convert them. For more information on MAD, see Leys et al. (2013), doi:10.1016/j.jesp.2013.03.013

Usage

mad_remove_outliers(
  x = NULL,
  threshold = 2.5,
  constant = 1.4826,
  convert_outliers_to = NA,
  output_type = "converted_vector"
)

Arguments

x

a numeric vector

threshold

the multiplier of the MAD that sets the cutoff values for determining outliers. If threshold = 2.5, the cutoffs are the median minus 2.5 times the MAD and the median plus 2.5 times the MAD; values less than the lower cutoff or greater than the upper cutoff are considered outliers. By default, threshold = 2.5.
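The cutoff rule described for threshold can be sketched in base R. This is only an illustration using stats::median and stats::mad, not the package's own source; the variable names (center, spread, lower, upper) are assumptions made for the example.

```r
# Illustrative sketch of the cutoff rule; variable names are hypothetical.
x <- c(1, 3, 3, 6, 8, 10, 10, 1000)
threshold <- 2.5

center <- stats::median(x)                  # median of x: 7
spread <- stats::mad(x, constant = 1.4826)  # scaled MAD: 1.4826 * 3.5

# Cutoffs are the median minus/plus threshold times the MAD.
lower <- center - threshold * spread
upper <- center + threshold * spread

# Values strictly outside the cutoffs are flagged as outliers.
is_outlier <- x < lower | x > upper
x[is_outlier]
```

Here only 1000 falls outside the interval from about -5.97 to about 19.97, so it is the single value flagged.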

constant

scale factor for the 'mad' function in the 'stats' package. The constant depends on the assumed distribution of the data; under normality, constant = 1.4826. By default, constant = 1.4826.
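To see what the scale factor does, the following sketch compares the unscaled MAD (constant = 1) with the default scaled MAD, and shows where 1.4826 comes from under normality: it is approximately 1 / qnorm(0.75). The variable names are assumptions made for the example.

```r
# Illustrative sketch; variable names are hypothetical.
x <- c(1, 3, 3, 6, 8, 10, 10, 1000)

# Unscaled MAD: the median of absolute deviations from the median.
raw_mad <- stats::mad(x, constant = 1)

# Scaled MAD: the unscaled MAD multiplied by the constant.
scaled_mad <- stats::mad(x, constant = 1.4826)

# Under normality, the scale factor is approximately 1 / qnorm(0.75).
normal_constant <- 1 / stats::qnorm(0.75)

c(raw_mad = raw_mad, scaled_mad = scaled_mad, normal_constant = normal_constant)
```

With this scaling, the MAD is a consistent estimator of the standard deviation for normally distributed data, which is why the threshold can be read on a scale comparable to standard deviations.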

convert_outliers_to

the value to which outliers will be converted. For example, if convert_outliers_to = NA, the outliers will be converted to NA; if convert_outliers_to = 1000, the outliers will be converted to 1000. By default, convert_outliers_to = NA.

output_type

type of the output. If output_type = "converted_vector", the output will be the input vector with outliers converted to the value set by the argument convert_outliers_to. If output_type = "outliers", the output will be the outliers in the original vector as determined by the MAD method. If output_type = "cutoff_values", the output will be the two cutoff values for determining outliers; for example, if outliers are values less than 0 or greater than 10, the cutoff values will be 0 and 10. If output_type = "non_outlier_values", the output will be a vector consisting only of the values that are not outliers; here, the outliers are removed from the vector rather than converted to another value such as NA. By default, output_type = "converted_vector".
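The four output types can be illustrated with a small base R stand-in. The helper mad_outlier_sketch below is hypothetical and written only for this example; it is not the package's implementation, and its treatment of missing values (ignored when computing the median and MAD, never flagged as outliers) is an assumption.

```r
# Hypothetical helper mirroring the documented arguments; not the package source.
mad_outlier_sketch <- function(x,
                               threshold = 2.5,
                               constant = 1.4826,
                               convert_outliers_to = NA,
                               output_type = "converted_vector") {
  center <- stats::median(x, na.rm = TRUE)
  spread <- stats::mad(x, constant = constant, na.rm = TRUE)
  cutoffs <- c(center - threshold * spread, center + threshold * spread)
  # Assumption: missing values are never treated as outliers.
  is_outlier <- !is.na(x) & (x < cutoffs[1] | x > cutoffs[2])
  switch(
    output_type,
    converted_vector = replace(x, is_outlier, convert_outliers_to),
    outliers = x[is_outlier],
    cutoff_values = cutoffs,
    non_outlier_values = x[!is_outlier],
    stop("unknown output_type: ", output_type)
  )
}

x <- c(1, 3, 3, 6, 8, 10, 10, 1000, -10000)
converted <- mad_outlier_sketch(x)
outliers <- mad_outlier_sketch(x, output_type = "outliers")
cutoffs <- mad_outlier_sketch(x, output_type = "cutoff_values")
kept <- mad_outlier_sketch(x, output_type = "non_outlier_values")
```

For this vector the median is 6 and the scaled MAD is 1.4826 * 4, so 1000 and -10000 are the two values outside the cutoffs; "converted_vector" keeps the original length, while "non_outlier_values" returns a shorter vector.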

Examples

## Not run: 
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000))
mad_remove_outliers(x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000))
# return the vector with the outliers converted to NA values
mad_remove_outliers(
  x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "converted_vector")
# return the cutoff values for determining outliers
mad_remove_outliers(
  x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "cutoff_values")
# return the outliers
mad_remove_outliers(
  x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "outliers")
# return only the values that are not outliers
mad_remove_outliers(
  x = c(1, 3, 3, 6, 8, 10, 10, 1000, -10000),
  output_type = "non_outlier_values")

## End(Not run)

kim documentation built on Oct. 9, 2023, 5:08 p.m.