find_bad_apples: Find bad apples

View source: R/find_bad_apples.R

Description

This function takes a univariate approach to outlier detection: a value counts as an outlier if it lies 2 or more standard deviations from the mean of its column. For each column that contains outliers, the function records the row indices of those outliers and the total number of outliers in that column.

Note: This function works best for small datasets with unimodal variable distributions.
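
The rule described above can be sketched for a single numeric vector as follows. This is a minimal illustration, not the package's source: the helper name `sketch_outlier_rows` and its `threshold` argument are hypothetical, and the handling of constant columns is an assumption.

```r
# Hypothetical helper (not part of mealprepR): returns the positions of values
# that lie at least `threshold` standard deviations from the mean of `x`.
sketch_outlier_rows <- function(x, threshold = 2) {
  s <- sd(x)
  # Assumption: a constant (or single-value) vector has no outliers
  if (is.na(s) || s == 0) return(integer(0))
  which(abs(x - mean(x)) >= threshold * s)
}

# 29 ones and a single 10 in position 4
x <- c(rep(1, 3), 10, rep(1, 26))
sketch_outlier_rows(x)
# 4
```

Because one extreme value inflates both the mean and the standard deviation, a rule like this can miss outliers in larger or multimodal data, which is consistent with the note above.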

Usage

find_bad_apples(df)

Arguments

df

A dataframe containing numeric data

Value

A dataframe with columns for 'variable' (dataframe column name), 'total_outliers' (number of outliers in the column), and 'indices' (list of row indices with outliers)
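
As a rough picture of that return shape, the sketch below builds a summary with the same three column names in base R. It is an assumption-laden illustration: the name `summary_sketch` is hypothetical, and the real function's column types and row ordering may differ.

```r
# Hypothetical illustration of the described return shape (not package code)
df <- data.frame(A = c(1, 1, 1, 10, rep(1, 26)),
                 B = c(rep(1, 29), 10))

# Row indices at least 2 standard deviations from each column's mean
idx <- lapply(df, function(x) which(abs(x - mean(x)) >= 2 * sd(x)))

summary_sketch <- data.frame(variable = names(idx),
                             total_outliers = unname(lengths(idx)))
# List column: one integer vector of row indices per variable
summary_sketch$indices <- unname(idx)

summary_sketch$indices[[1]]  # row indices flagged in column A
```

Storing the indices as a list column keeps one row per variable even when columns have different numbers of outliers.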

Examples

df <- data.frame('A' = c(1, 1, 1, 10, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
                 'B' = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 10))

find_bad_apples(df)

UBC-MDS/mealprepR documentation built on April 1, 2020, 4:36 a.m.