removeOLs: Remove outlying observations from a data.frame or matrix

View source: R/faststats.R

removeOLs    R Documentation

Remove outlying observations from a data.frame or matrix

Description

Outliers are defined as values that deviate from the mean by more than s standard deviations (SDs), where s is set by the argument of the same name (default 3).
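The rule above can be sketched in base R. This is a minimal illustration and not the package's own code: the helper name is_outlier_sketch is hypothetical, and the use of the sample SD, a strict "greater than" comparison, and NA-skipping are assumptions.

```r
# Hypothetical helper, not part of skMisc.
# Flags values lying more than `s` sample SDs from the mean.
# Assumption: NAs are ignored when computing the mean and SD.
is_outlier_sketch <- function(x, s = 3) {
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  !is.na(z) & abs(z) > s
}

# Twenty identical values plus one extreme value
x <- c(rep(10, 20), 100)
flags <- is_outlier_sketch(x, s = 3)
which(flags)  # position of the flagged value
```

Here only the final value (100) is flagged; it lies roughly 4.4 SDs from the mean.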

Usage

removeOLs(.tbl, olvars = NULL, groups = NULL, s = 3, make.na = FALSE)

Arguments

.tbl

The data.frame or matrix from which to exclude outliers.

olvars

Names or numeric indices of the variables in which to detect outliers. If NULL, all variables are checked for outliers.

groups

(Optional) Name or numeric index of the variable identifying groups of observations. If given, outlier detection is performed separately within each group.

s

The number of SDs a value must deviate from the mean, beyond which it is marked as an outlier. Defaults to 3.

make.na

If FALSE (default), excludes every row that has an outlier in at least one of the variables in olvars (listwise exclusion). If TRUE, the function instead replaces the individual outlying values with NA and does not exclude any rows.
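The difference between the two make.na modes can be illustrated with a small base R sketch. It does not call removeOLs() and may differ from it in detail; the helper flag and the toy data frame df are made up for illustration.

```r
# Hypothetical base R sketch; not the skMisc implementation.
flag <- function(x, s = 3) {
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  !is.na(z) & abs(z) > s
}

# Toy data: column a has one extreme value in row 21, column b has none
df <- data.frame(a = c(rep(c(1, 2), 10), 50), b = 1:21)

# Logical matrix, one column per variable, TRUE where a value is flagged
ol <- sapply(df, flag)

# Analogue of make.na = FALSE: drop every row with at least one flagged value
listwise <- df[!apply(ol, 1, any), , drop = FALSE]

# Analogue of make.na = TRUE: keep all rows, blank out only the flagged cells
as_na <- df
as_na[ol] <- NA

nrow(listwise)
sum(is.na(as_na))
```

Listwise exclusion discards the whole of row 21, including its unremarkable value in b, whereas the NA variant keeps that row and loses only the single flagged cell.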

Details

The function does not detect any outliers in groups with fewer than 3 non-NA observations.
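A per-group version of the rule, including the minimum group size, might look like the following base R sketch. The function flag_by_group is hypothetical, and the use of ave() is an assumption made for illustration rather than a description of how removeOLs() works internally.

```r
# Hypothetical sketch, not part of skMisc.
# Flags outliers separately within each group; groups with fewer
# than 3 non-NA observations are never flagged.
flag_by_group <- function(x, g, s = 3) {
  flag <- function(v) {
    if (sum(!is.na(v)) < 3) return(rep(FALSE, length(v)))
    z <- (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)
    !is.na(z) & abs(z) > s
  }
  # ave() applies `flag` within each group and returns values in the
  # original order; it yields 0/1 here, so convert back to logical
  as.logical(ave(x, g, FUN = flag))
}

# A group of 21 values with one extreme value, and a group of only 2 values
x <- c(rep(c(1, 2), 10), 50, 5, 500)
g <- c(rep("big", 21), rep("small", 2))
group_flags <- flag_by_group(x, g)
which(group_flags)
```

Only the extreme value in the larger group is flagged. The two values in the small group are left alone even though they differ widely: with exactly two observations, each value always lies about 0.71 sample SDs from their mean, so an SD-based rule carries little information there.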

Value

The input data.frame or matrix with rows containing outliers excluded or, if make.na is TRUE, with the outlying values replaced by NA.

Author(s)

Sercan Kahveci

See Also

vec.removeOLs() for the same outlier exclusion applied to a single vector.

Examples

# The SD limit is set with the argument s (default 3)
removeOLs(mtcars, olvars = c("mpg", "disp", "hp"))
# A stricter limit of 1 SD flags more values
removeOLs(mtcars, olvars = c("mpg", "disp", "hp"), s = 1)

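# Replace outliers with NA using the make.na argument,
# detecting outliers separately per level of "vs"
testdata <- mtcars
testdata$mpg[1] <- 40   # plant an extreme mpg value
testdata$hp[2] <- 500   # plant an extreme hp value
removeOLs(testdata, olvars = c("mpg", "disp", "hp"), groups = "vs", make.na = TRUE)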

# Also works on matrices
testmat <- matrix(rnorm(1000), ncol = 5)
# Plant one extreme value (1000) in a random row of each column
testmat[cbind(sample(1:200, 5), 1:5)] <- 1000
removeOLs(testmat)


Spiritspeak/skMisc documentation built on April 12, 2025, 5:40 a.m.