logger$info("Entering outlier analysis section")



Outlier analysis

For each location_code, an adjusted boxplot of the total count is shown below for outlier detection in the period `r str_c(DATE_FROM, " to ", DATE_TO)`. Outliers, if any, appear as dots in these adjusted box-and-whisker plots. Note that outliers are not necessarily errors.
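To make "adjusted" concrete, here is a minimal sketch of one common definition of the adjusted boxplot fences (Hubert and Vandervieren, 2008), in which the classic 1.5 × IQR fences are stretched or shrunk according to a skewness measure, the medcouple. Whether `stat_adj_boxplot()` follows exactly this rule is an assumption here. `adj_fences` is a hypothetical helper, not a litteR function, and the medcouple is passed in as a plain number rather than estimated from the data.

```r
# Hypothetical helper (not part of litteR): fences of an adjusted boxplot
# for a given skewness measure `mc` (the medcouple, a value in [-1, 1]).
# Assumes the Hubert & Vandervieren (2008) rule.
adj_fences <- function(x, mc) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  if (mc >= 0) {
    # right-skewed data: shorter lower fence, longer upper fence
    lower <- q[1] - 1.5 * exp(-4 * mc) * iqr
    upper <- q[2] + 1.5 * exp(3 * mc) * iqr
  } else {
    # left-skewed data: longer lower fence, shorter upper fence
    lower <- q[1] - 1.5 * exp(-3 * mc) * iqr
    upper <- q[2] + 1.5 * exp(4 * mc) * iqr
  }
  c(lower = lower, upper = upper)
}

tc <- c(3, 5, 6, 8, 9, 12, 15, 21, 34, 60)  # made-up right-skewed counts
sym <- adj_fences(tc, mc = 0)     # mc = 0 reduces to the classic 1.5 * IQR rule
skew <- adj_fences(tc, mc = 0.4)  # mc = 0.4 is an arbitrary illustrative value
```

With these made-up counts, the value 60 lies above the classic upper fence but below the skewness-adjusted one, which is why an adjusted boxplot tends to flag fewer points in the long tail of skewed count data.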



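# Keep only the total-count (TC) records: one row per location and survey date
d <- d_ltr %>%
    filter(type_name == "TC") %>%
    select(location_code, date, tc = count)
# Adjusted boxplot of the total count for each location
d %>%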
    ggplot(mapping = aes(x = location_code, y = tc), alpha = 0.5) +
        stat_adj_boxplot() +
        stat_adj_boxplot_outlier() +
        scale_x_discrete(name = "") +
        scale_y_continuous(name = "total count") +
        coord_flip()
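# Per location: number of surveys (n) and adjusted boxplot statistics;
# then keep only the records with tc outside the range [ymin, ymax]
d <- d %>%
    group_by(location_code) %>%
    summarise(n = n(), stats = list(adj_boxplot_stats(tc)), .groups = "drop") %>%
    split(seq_len(nrow(.))) %>%
    map_df(function(x) {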
        d %>%
            mutate(n = x %>% chuck("n")) %>%
            filter(location_code == x %>% chuck("location_code")) %>%
            filter(
                (tc < (x %>% 
                    pluck("stats") %>% 
                    unlist %>% 
                    chuck("ymin"))) |
                (tc > (x %>% 
                    pluck("stats") %>% 
                    unlist %>% 
                    chuck("ymax"))))
    })

Outliers, if any, are listed in the table below, together with the number of surveys n per location. Litter experts should decide whether outliers are errors that need to be excluded from the analysis. Note, however, that because of its non-parametric nature, litteR is fairly robust to outliers.
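The selection of records for this table can be illustrated with a small base-R sketch. The data are made up, and `fences` is a hypothetical stand-in that uses the classic Tukey whisker ends from `boxplot.stats()` rather than litteR's `adj_boxplot_stats()`, so it shows only the general shape of the step, not the package's exact statistics.

```r
# Toy data standing in for d (same column names); the values are made up.
d_toy <- data.frame(
  location_code = rep(c("A", "B"), each = 6),
  tc = c(10, 12, 11, 13, 12, 95, 40, 42, 41, 39, 43, 40),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in: classic Tukey whisker ends from base R,
# NOT litteR's adjusted boxplot statistics.
fences <- function(x) {
  s <- boxplot.stats(x)$stats
  c(ymin = s[1], ymax = s[5])
}

# One row of limits per location, and the number of surveys per location
lims <- t(sapply(split(d_toy$tc, d_toy$location_code), fences))
n_surveys <- table(d_toy$location_code)

# Flag records outside their own location's whisker range
loc <- d_toy$location_code
is_outlier <- d_toy$tc < lims[loc, "ymin"] | d_toy$tc > lims[loc, "ymax"]

# Outlier records with the survey count n attached
outliers <- cbind(
  d_toy[is_outlier, ],
  n = as.integer(n_surveys[loc][is_outlier])
)
```

Looking the limits up by location name avoids splitting the summary table row by row, at the cost of holding the limits in a small matrix.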



d %>%
    arrange(location_code, tc) %>%
    select(location_code, date, n, tc) %>%
    rename("total count" = tc) %>%
    kable()



litteR documentation built on Aug. 27, 2022, 1:05 a.m.