Returns a logical vector where TRUE indicates outliers.
outlier_tukey(
  x,
  k = 1.5,
  ignore_lwr = FALSE,
  apply_log = FALSE,
  ignore_zero = FALSE
)
outlier_tukey_top(x, k = 1.5, apply_log = FALSE, ignore_zero = FALSE)
x: input values to check
k: the IQR multiplier that determines the fence level. Increasing k makes outlier identification less strict (and vice versa)
ignore_lwr: If TRUE, don't use the lower fence for identifying outliers
apply_log: If TRUE, log-transform input values prior to applying Tukey's rule. Useful since distributions often have a log-normal shape (e.g., spending)
ignore_zero: If TRUE, exclude zero values from the IQR calculation and from flagging. Note that zeroes are automatically ignored if apply_log = TRUE
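To make the arguments above concrete, here is a minimal base-R sketch of Tukey's rule. It is not the package's source: `tukey_flag_sketch` is a hypothetical name, and the sketch assumes that the fences are built from `quantile()` with its default settings, that zeroes excluded from the calculation are returned as FALSE, and that inputs are non-negative when `apply_log = TRUE`.

```r
# Hypothetical illustration of Tukey's rule; not the package implementation.
tukey_flag_sketch <- function(x, k = 1.5, ignore_lwr = FALSE,
                              apply_log = FALSE, ignore_zero = FALSE) {
  # Values that take part in the fence calculation
  use <- !is.na(x)
  # Assumption: zeroes are dropped when ignore_zero or apply_log is TRUE
  if (ignore_zero || apply_log) use <- use & x != 0
  y <- x[use]
  if (apply_log) y <- log(y)

  # Quartiles and interquartile range of the retained values
  q <- quantile(y, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr

  # Flag values beyond the upper fence, and the lower fence unless ignored
  flagged <- y > upper
  if (!ignore_lwr) flagged <- flagged | y < lower

  # Assumption: excluded values (NA or zero) are reported as FALSE
  out <- rep(FALSE, length(x))
  out[use] <- flagged
  out
}

days <- c(1, 2, 3, 4, 5, 100)
tukey_flag_sketch(days)          # default fences
tukey_flag_sketch(days, k = 50)  # a much larger k widens the fences
```

A larger `k` pushes both fences outward, which is why increasing it flags fewer values.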
outlier_tukey_top: get the largest non-outlier value for top-coding
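As a rough illustration of top-coding, the sketch below finds the largest value at or below the upper Tukey fence and caps everything above it. `tukey_top_sketch` is a hypothetical helper, not the package function; it assumes default `quantile()` settings and omits the `apply_log` and `ignore_zero` options for brevity.

```r
# Hypothetical illustration of top-coding; not the package implementation.
tukey_top_sketch <- function(x, k = 1.5) {
  # Upper fence: third quartile plus k times the interquartile range
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  upper <- q[2] + k * (q[2] - q[1])
  # Largest observed value that is not beyond the upper fence
  max(x[!is.na(x) & x <= upper])
}

days <- c(1, 2, 3, 4, 5, 100)
top <- tukey_top_sketch(days)

# Top-code: replace anything above the cap with the cap itself
days_topcoded <- pmin(days, top)
```

Top-coding keeps the flagged observations in the data at a capped value, whereas setting them to NA drops them from later summaries.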
Other functions for identifying outliers: outlier_mean_compare(), outlier_pct(), outlier_plot()
library(dplyr)
data(svy)

# take a look at the days variable
outlier_plot(svy$act, days, act)
outlier_plot(svy$act, days, act, apply_log = TRUE)

# flag outliers within each activity
activity <- group_by(svy$act, act) %>%
  mutate(
    is_outlier = outlier_tukey(days, ignore_zero = TRUE, apply_log = TRUE),
    # in case we want to topcode the outliers:
    topcode_value = outlier_tukey_top(days, apply_log = TRUE),
    days_cleaned = ifelse(is_outlier, NA, days)
  ) %>%
  ungroup()

# summarize
outlier_plot(activity, days, act, apply_log = TRUE, show_outliers = TRUE)
outlier_pct(activity, act)
outlier_mean_compare(activity, days, days_cleaned, act)