View source: R/find_bad_apples.R
This function takes a univariate approach to outlier detection. A value is treated as an outlier if it lies 2 or more standard deviations from the mean of its column. For each column that contains outliers, the function reports the row indices of those outliers and the total number of outliers in that column.
Note: This function works best for small datasets with unimodal variable distributions.
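The 2-standard-deviation rule described above can be sketched for a single numeric vector as follows. This is a minimal illustration, not the package source: `flag_outliers` and its `threshold` argument are hypothetical names, and the handling of missing values is an assumption.

```r
# Hypothetical helper (not part of the package): return the positions of
# values that lie `threshold` or more standard deviations from the mean.
flag_outliers <- function(x, threshold = 2) {
  # Distance of each value from the mean, in units of standard deviation
  z <- abs(x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  # which() skips NA/NaN, so a constant vector (sd of 0) yields no outliers
  which(z >= threshold)
}

x <- c(1, 1, 1, 10, rep(1, 26))
flag_outliers(x)  # position 4 holds the value 10
```

Because the mean and standard deviation are themselves pulled toward extreme values, a rule like this is most reliable when each column has a single peak, which is consistent with the note above.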
find_bad_apples(df)
df    A dataframe containing numeric data
A dataframe with columns for 'variable' (dataframe column name), 'total_outliers' (number of outliers in the column), and 'indices' (list of row indices with outliers)
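To make the shape of that return value concrete, the sketch below builds a dataframe with the same three columns from scratch. It is an assumption-laden illustration rather than the package's implementation: `summarise_outliers` is a hypothetical name, and the column types (character, integer, list) are guesses based on the description above.

```r
# Hypothetical sketch (not the package source): one row per column that
# contains outliers, with a list column holding the outlier row indices.
summarise_outliers <- function(df, threshold = 2) {
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  hits <- lapply(df[numeric_cols], function(x) {
    s <- sd(x, na.rm = TRUE)
    # Constant or single-value columns cannot have outliers under this rule
    if (is.na(s) || s == 0) return(integer(0))
    which(abs(x - mean(x, na.rm = TRUE)) >= threshold * s)
  })
  hits <- hits[lengths(hits) > 0]  # keep only columns with outliers
  out <- data.frame(
    variable = as.character(names(hits)),
    total_outliers = as.integer(lengths(hits)),
    stringsAsFactors = FALSE
  )
  out$indices <- unname(hits)      # list column of row indices
  out
}

demo <- data.frame(A = c(1, 1, 1, 10, rep(1, 26)),
                   B = c(rep(1, 29), 10))
res <- summarise_outliers(demo)
```

Storing `indices` as a list column keeps one row per variable even when the columns have different numbers of outliers.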
df <- data.frame('A' = c(1, 1, 1, 10, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
                 'B' = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
                         1, 1, 1, 1, 1, 1, 1, 1, 1, 10))
find_bad_apples(df)