outliers.check: Check and Clean Outliers in a Numeric Vector

View source: R/outliers.check.r

outliers.check    R Documentation

Check and Clean Outliers in a Numeric Vector

Description

Identifies outliers in a numeric vector using the interquartile range (IQR) method and either replaces them or flags them. A value is an outlier if it is below Q1 - k*IQR or above Q3 + k*IQR, where k is a user-defined threshold (the sd.thresh argument).

Usage

outliers.check(raw.vector, sd.thresh = 1.5, fun.replace = "mean", logical = FALSE)

Arguments

raw.vector

A numeric vector to check for outliers

sd.thresh

Numeric multiplier applied to the IQR to set the outlier bounds (default = 1.5). Despite its name, this is an IQR multiplier, not a standard-deviation threshold

fun.replace

Character string specifying the replacement method: either "mean" or "median" (default = "mean")

logical

Logical value. If TRUE, returns a factor indicating outlier status (1 = outlier, 0 = not outlier). If FALSE, returns the cleaned vector (default = FALSE)

Details

The function uses the following method to identify outliers:

* Calculates Q1 (25th percentile), Q3 (75th percentile), and IQR
* Identifies values outside the interval [Q1 - sd.thresh*IQR, Q3 + sd.thresh*IQR]
* Replaces outliers with either the mean or the median of the original vector
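The steps above can be sketched in a few lines of base R. This is only an illustration, not the package source: the function name iqr_outlier_sketch and its argument names are hypothetical, the quartiles are assumed to come from stats::quantile() with its default type, the input is assumed to contain no missing values, and the replacement value is computed from the original vector with the outliers still included.

```r
# Hypothetical re-implementation for illustration; not the package source.
iqr_outlier_sketch <- function(x, k = 1.5,
                               replace_with = c("mean", "median"),
                               flag = FALSE) {
  replace_with <- match.arg(replace_with)

  # Quartiles and IQR (assumption: default quantile type, no NA values)
  q <- stats::quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]

  # Values strictly outside [Q1 - k*IQR, Q3 + k*IQR] count as outliers
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr
  is_out <- x < lower | x > upper

  if (flag) {
    # Factor coded 1 = outlier, 0 = not outlier
    return(factor(as.integer(is_out), levels = c(0, 1)))
  }

  # Replacement is computed from the original vector, outliers included
  fill <- if (replace_with == "mean") mean(x) else stats::median(x)
  x[is_out] <- fill
  x
}

x <- c(1, 2, 3, 100, 2, 3, 4, -50)
iqr_outlier_sketch(x)               # cleaned vector
iqr_outlier_sketch(x, flag = TRUE)  # factor of 0/1 flags
```

Under these assumptions, Q1 = 1.75 and Q3 = 3.25 for this x, so the bounds with k = 1.5 are -0.5 and 5.5, and the values 100 and -50 are treated as outliers. Because the replacement is taken from the original vector, extreme values still influence the mean that replaces them; the median is less sensitive to this.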

Value

If logical = FALSE (default), returns a numeric vector with outliers replaced by either the mean or median of the original vector. If logical = TRUE, returns a factor vector where 1 indicates outliers and 0 indicates non-outliers.
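Because the flagged form is a factor rather than a logical vector, it usually has to be compared against a level label before it can be used for indexing. The sketch below uses a hand-built factor as a stand-in for a result returned with logical = TRUE; the level labels "0" and "1" are an assumption based on the coding described above.

```r
# Hand-built stand-in for a flagged result; level labels "0"/"1" are assumed
x <- c(1, 2, 3, 100, 2, 3, 4, -50)
flags <- factor(c(0, 0, 0, 1, 0, 0, 0, 1), levels = c(0, 1))

# Compare against the level label to get a logical mask
is_out <- flags == "1"

outlier_positions <- which(is_out)  # indices of flagged values
outlier_values <- x[is_out]         # the flagged values themselves
kept_values <- x[!is_out]           # everything else
```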

Examples

# Clean outliers using mean replacement
x <- c(1, 2, 3, 100, 2, 3, 4, -50)
outliers.check(x)

# Get a factor flagging outlier positions (1 = outlier, 0 = not outlier)
outliers.check(x, logical = TRUE)

# Clean outliers using median with different threshold
outliers.check(x, sd.thresh = 2, fun.replace = "median")


ccamp83/mu documentation built on Nov. 7, 2024, 5:17 p.m.