detect_outlier2: Advanced Outlier Detection in Numeric Data with Optional...

View source: R/detect_outlier.R

detect_outlier2 R Documentation

Advanced Outlier Detection in Numeric Data with Optional Grouping

Description

Detects outliers in numeric data using several statistical methods and returns a comprehensive analysis, including visualizations and summary statistics. Three detection methods are supported: IQR, Z-score, and Modified Z-score. When a group vector is provided, the analysis is also run separately for each group.
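The three rules named above can be sketched in base R. This is only an illustration using the textbook definitions; the helper names (flag_iqr, flag_zscore, flag_modified_zscore) are hypothetical and are not part of quickcode, and the package's internal computation may differ in details such as quantile type or NA handling.

```r
# Illustrative helpers (hypothetical names, not the quickcode implementation).
# Each returns a logical vector marking the flagged elements of x.

# IQR rule: flag values outside [Q1 - k * IQR, Q3 + k * IQR].
flag_iqr <- function(x, multiplier = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]
  x < q[1] - multiplier * spread | x > q[2] + multiplier * spread
}

# Z-score rule: flag values more than z_threshold sample SDs from the mean.
flag_zscore <- function(x, z_threshold = 3) {
  z <- (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)
  abs(z) > z_threshold
}

# Modified Z-score rule: the median and the unscaled MAD replace mean and SD.
# Assumes the MAD is non-zero; a zero MAD gives Inf or NaN scores.
flag_modified_zscore <- function(x, modified_z_threshold = 3.5) {
  med <- stats::median(x, na.rm = TRUE)
  mad_raw <- stats::median(abs(x - med), na.rm = TRUE)
  m <- 0.6745 * (x - med) / mad_raw
  abs(m) > modified_z_threshold
}

scores <- c(65, 70, 75, 72, 68, 73, 78, 71, 69, 150)
which(flag_iqr(scores))              # the value 150 is flagged
which(flag_zscore(scores))           # nothing is flagged at threshold 3
which(flag_zscore(scores, 2))        # 150 is flagged at threshold 2
which(flag_modified_zscore(scores))  # the value 150 is flagged
```

In a sample of 10 values the largest possible absolute Z-score is 9 / sqrt(10), about 2.85, because an extreme value inflates the standard deviation it is measured against. This is one reason a lower z_threshold, or a median-based rule, can be preferable for small samples.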

Usage

detect_outlier2(
  x,
  groups = NULL,
  method = c("iqr", "zscore", "modified_zscore", "all"),
  multiplier = 1.5,
  z_threshold = 3,
  modified_z_threshold = 3.5,
  plot = TRUE,
  plot_groups = TRUE
)

Arguments

x

A numeric vector containing the data to analyze

groups

Optional vector of group labels; must be the same length as x

method

Character string specifying the outlier detection method. Options are "iqr", "zscore", "modified_zscore", or "all". Default is "iqr"

multiplier

Numeric value specifying the IQR multiplier for outlier detection. Default is 1.5

z_threshold

Numeric value specifying the Z-score threshold. Default is 3

modified_z_threshold

Numeric value specifying the modified Z-score threshold. Default is 3.5

plot

Logical indicating whether to generate visualization plots. Default is TRUE

plot_groups

Logical indicating whether to create separate plots for each group. Only used when groups are provided. Default is TRUE
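To show how groups and multiplier interact, the sketch below applies the IQR rule within each group rather than to the pooled data. It is a minimal illustration under stated assumptions: flag_iqr_by_group is a hypothetical helper, not a quickcode function, and the package may split and summarize groups differently.

```r
# Hypothetical helper (not part of quickcode): IQR rule applied per group.
flag_iqr_by_group <- function(x, groups, multiplier = 1.5) {
  stopifnot(is.numeric(x), length(groups) == length(x))
  flags <- logical(length(x))
  # split() returns one vector of positions per group label
  for (idx in split(seq_along(x), groups)) {
    q <- stats::quantile(x[idx], probs = c(0.25, 0.75), names = FALSE)
    fence <- multiplier * (q[2] - q[1])
    flags[idx] <- x[idx] < q[1] - fence | x[idx] > q[2] + fence
  }
  flags
}

x <- c(10, 11, 12, 13, 30, 50, 51, 52, 53, 54)
g <- rep(c("A", "B"), each = 5)

# Within group A the value 30 lies outside the fences and is flagged.
which(flag_iqr_by_group(x, g))

# Treating all values as one group, the wide pooled IQR hides it.
which(flag_iqr_by_group(x, rep("all", length(x))))
```

A value can be unremarkable in the pooled data yet extreme within its own group, which is why overall and group-specific results can disagree.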

Value

A list of class "outlier_analysis" containing:

  • overall: Overall analysis results

  • by_group: Group-specific results (if groups provided)

  • plots: List of generated plots

Examples

# Example 1: Basic analysis, without and with groups
set.seed(123)
data <- c(rnorm(50), rnorm(50, 2), rnorm(50, 4))
groups <- rep(c("A", "B", "C"), each = 50)
resultA <- detect_outlier2(data) # no groups
resultB <- detect_outlier2(data, groups = groups) # with groups

# Example 2: All methods with a custom Z-score threshold, analyzed by group
test_scores <- c(65, 70, 75, 72, 68, 73, 78, 71, 69, 74,
                 90, 85, 92, 88, 95, 87, 91, 89, 86, 93)
class_groups <- rep(c("Morning", "Afternoon"), each = 10)
result <- detect_outlier2(test_scores,
                          groups = class_groups,
                          method = "all",
                          z_threshold = 2)
result$overall
result$by_group
result$plots$overall$boxplot()
result$plots$overall$density()
result$plots$overall$comparison()
result$plots$by_group$Morning$boxplot()
result$plots$by_group$Morning$density()
result$plots$by_group$Afternoon$boxplot()
result$plots$by_group$Afternoon$density()


quickcode documentation built on April 11, 2025, 5:49 p.m.