correct_outliers: Correct Outliers in a Dataset


correct_outliers    R Documentation

Correct Outliers in a Dataset

Description

This function identifies and corrects outliers in a dataset using principal component analysis (PCA). It scales the data, performs PCA, and computes the idiosyncratic components. Any value that falls outside the outlier threshold, which is determined by the interquartile range (IQR) method, is replaced with the median of the five preceding values.
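The steps above can be sketched in base R. This is an illustrative sketch only, not the package's source. It assumes that the idiosyncratic component is the scaled data minus its projection onto the first r principal components, that the IQR rule is applied per column with the conventional multiplier of 1.5, and that the replacement median is taken over up to five preceding rows of the same column. The name `sketch_correct_outliers` and the arguments `k` and `window` are hypothetical.

```r
# Hypothetical sketch of the documented procedure; NOT the FARS implementation.
sketch_correct_outliers <- function(data, r, k = 1.5, window = 5) {
  x <- as.matrix(data)
  z <- scale(x)  # center and scale each column
  pca <- prcomp(z, center = FALSE, scale. = FALSE)

  # Common component: projection onto the first r principal components
  loadings <- pca$rotation[, seq_len(r), drop = FALSE]
  common <- z %*% loadings %*% t(loadings)
  idio <- z - common  # idiosyncratic component (assumed definition)

  outliers <- matrix(0L, nrow(x), ncol(x))
  corrected <- x
  for (j in seq_len(ncol(x))) {
    # IQR fences on the idiosyncratic component (multiplier k is an assumption)
    q <- quantile(idio[, j], c(0.25, 0.75), names = FALSE)
    iqr <- q[2] - q[1]
    flagged <- idio[, j] < q[1] - k * iqr | idio[, j] > q[2] + k * iqr
    outliers[flagged, j] <- 1L
    for (i in which(flagged)) {
      # Rows with no preceding value are flagged but left unchanged
      if (i > 1) {
        prev <- corrected[max(1, i - window):(i - 1), j]
        corrected[i, j] <- median(prev)
      }
    }
  }
  list(data = corrected, outliers = outliers)
}
```

In this sketch, the median is computed over already-corrected preceding values, so an earlier outlier does not influence a later replacement. Whether the package does the same is not stated here.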

Usage

correct_outliers(data, r)

Arguments

data

A numeric matrix or data frame where rows represent observations and columns represent variables.

r

An integer specifying the number of principal components to retain in the PCA.

Value

A list containing:

data

A matrix of corrected data in which each outlier is replaced by the median of the five preceding values.

outliers

A binary matrix (same dimensions as the input data) indicating the positions of the outliers.

Examples

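set.seed(123)  # make the simulated data reproducible
data <- matrix(rnorm(100), nrow = 10, ncol = 10)
result <- correct_outliers(data, r = 3)
corrected_data <- result$data        # matrix with outliers replaced
outliers_matrix <- result$outliers   # binary matrix marking outlier positions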
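Once the outlier matrix is available, it can be summarised with base R. The sketch below uses a hand-built stand-in for `result$outliers` so that it runs on its own. It assumes that 1 marks an outlier and 0 a regular value, which the documentation implies by "binary" but does not state explicitly.

```r
# Stand-in for result$outliers (assumed encoding: 1 = outlier, 0 = regular value)
outliers_matrix <- matrix(0L, nrow = 10, ncol = 10)
outliers_matrix[c(4, 27, 93)] <- 1L  # flag three cells by linear index

# Total number of flagged cells
n_flagged <- sum(outliers_matrix)

# Number of flagged cells per variable (column)
per_variable <- colSums(outliers_matrix)

# Row and column positions of the flagged cells
positions <- which(outliers_matrix == 1, arr.ind = TRUE)
```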


FARS documentation built on Aug. 8, 2025, 7:33 p.m.