In sadamov/trapcomparison: What is really in the air? An evaluation of multiple automatic pollen monitors

Setup

This project is using renv dependency management, for more info: https://cran.r-project.org/web/packages/renv/vignettes/renv.html The .RProfile is optimized for an interactive session in VSCode.

knitr::opts_chunk$set(
  echo = FALSE,
  error = FALSE,
  warning = FALSE,
  message = FALSE,
  fig.retina = 3,
  fig.width = 10,
  fig.height = 7,
  out.width = "100%",
  out.height = "100%"
)

library(caTools)
library(MASS)
library(tidyverse)
library(magrittr)
library(lubridate)
library(ggpubr)
library(here)
library(kableExtra)
library(RColorBrewer)
library(padr)
library(psych)
library(nparcomp)
library(conflicted)
conflict_prefer("select", "dplyr")
conflict_prefer("filter", "dplyr")
conflict_prefer("extract", "magrittr")

Data Import

The data was prepared by Fiona Tummon (main author of paper) and stored in the /data-raw Folder. This first step of data preparation is not documented here. Nine different Pollen Traps were situated on the roof in Payerne and continuously measured pollen grains between 19.04.2019 - 31.05.2019. For the sake of simplicity we only look at concentrations of total pollen (i.e. the sum of all pollen taxa). The concentrations were temporally averaged to obtain three different timeseries:

Hourly averages
6-hour averages (hourly values from 00:00 to 05:00, 06:00 to 11:00, ...)
Daily averages (hourly values from 00:00 to 23:00)

The preprocessed data is available to the reader in the /data folder. In /data-raw/preproc.R the preprocessing is documented. The data has missing values in the Hirst2, Poleno(1 and 4) and Rapid-E timeseries. For Poleno1 calibration was carried out on 11 days. For Poleno4 calibration was carried out on 6 days. For Rapid-E software malfunctioned (it seems) during 9 days, as discussed with MDS (Meteoswiss - measurements and data). If there was too much missing data present in a timeseries, all timeseries were set to NA for this period. This allows for a fair comparison between traps, only when all of them measured. For hourly values, every hour was investigated by itself. For sixhour and daily averages, "too much missing data" was defined as more than 50% of the hourly values in the period. The whole analysis is based on the data excluding such windows with too much missing data, except for the timeseries plot (figure 1). For the timeseries the full timeseries are displayed, which includes NAs for Poleno; Hirst and Rapid-E.

The following eight traps are being compared:

Hirst = Mean(Hirst 1, Hirst 2) / 1.35 which will be the standard to compare against. Historically, we have the most experience with Hirst traps.
BAA-500 (no hourly values)
KH-3000-A
KH-3000-B
Poleno-1
Poleno-4
Rapid-E
WIBS-NEO

# The data without calibration days
load(paste0(here(), "/data/pollen.rda"))
# The data with calibration days for the timeseries plot
load(paste0(here(), "/data/pollen_full.rda"))
load(paste0(here(), "/data/pollen_full_with_hirst.rda"))
aggregation <- c("daily", "sixhour", "hourly")
map(aggregation, ~ pollen %>% extract2(.x))

Comparison

Some Tables

The table shows the percentage of measurements still available in the underlying dataset after:

Removal of missing data if less than half of the measurements were available.
Additional Removal of time steps where the Hirst-Average of both traps is below 10 grains/m3.

missing <- map2(
  pollen, pollen_full,
  ~ round(
    .x %>%
      pull(datetime) %>%
      unique() %>%
      length() /
    .y %>%
      pull(datetime) %>%
      unique() %>%
      length(),
    4
  ) %>%
    scales::percent(accuracy = 0.1)
)

pollen_low <- map(pollen, ~.x %>%
    filter(trap == "Hirst") %>%
    mutate(hirst_low = if_else(conc <= 10, TRUE, FALSE)))

lower_10 <- map2(
  pollen_low,
  pollen_full,
  ~ round(
    .x %>%
      filter(!hirst_low) %>%
      pull(datetime) %>%
      unique() %>%
      length() /
    .y %>%
      pull(datetime) %>%
      unique() %>%
      length(),
    4
  ) %>%
    scales::percent(accuracy = 0.1)
)

kable_data_exclusion <- as_tibble(missing) %>%
  bind_rows(as_tibble(lower_10)) %>%
  bind_cols(tibble(Comment = c("After removal of missing data", "After removal of missing data or Hirst-mean <= 10 grains/m3"))) %>%
  kable(escape = FALSE) %>%
  kable_styling(c("striped", "condensed"),
                full_width = FALSE,
                font_size = 20)

save_kable(kable_data_exclusion, file = paste0(here(), "/tables/table0_kable_data_exclusion.html"))
kable_data_exclusion

The number of traps measuring pollen simultaneously decreases with higher temporal resolution. This table shows the number of traps measuring 0 pollen (note that missing values were already excluded).

number_of_measurements <- map(pollen, ~ .x %>%
  group_by(trap, date, hour) %>%
  mutate(ntraps = as.integer(as.integer(conc) != 0)) %>%
  ungroup() %>%
  group_by(date, hour) %>%
  summarise(ntraps = sum(ntraps)) %>%
  ungroup() %>%
  mutate(ntraps = factor(ntraps)))

timesteps <- map(number_of_measurements, ~ .x %>% nrow())

kable_ntraps <- pmap(
  list(number_of_measurements, aggregation, c(8, 8, 7)),
  function(first, second, third) {
    first %>%
      count(ntraps) %>%
      mutate(
        freq = scales::percent(n / nrow(first), accuracy = 0.1),
        res = second,
        tottraps = paste0(ntraps, " / ", third)
      )
  }
) %>%
  bind_rows() %>%
  select(
    "Temporal Resolution" = res,
    "Number of Traps Measuring" = tottraps,
    "Frequency of Occurence" = freq
  ) %>%
  kable(escape = FALSE) %>%
  kable_styling(c("striped", "condensed"),
                full_width = FALSE,
                font_size = 20) %>%
  column_spec(1, border_right = TRUE) %>%
  collapse_rows(columns = 1) %>%
  pack_rows(NULL, 1, 1, indent = FALSE) %>%
  pack_rows(NULL, 2, 5, indent = FALSE) %>%
  pack_rows(NULL, 6, 10, indent = FALSE)

save_kable(kable_ntraps, file = paste0(here(), "/tables/table1_ntraps.html"))
kable_ntraps

Several metrics were calculated: the frequency of occurrence was obtained by counting the number of days on which pollen were detected; the average was calculated by averaging values for total pollen measurements during the study period; and the Seasonal Pollen Integral (SPI) was calculated by integrating the concentrations of total pollen over the study period. (Mandrioli et al., 1998).

These metrics are well-known and a standard part of any Pollen Measurement study. It is common to investigate individual pollen species during their blooming period. Here we are calculating them for total pollen, which might not be as meaningful, but can function as a crude mean for comparison.

pol_occurence <- pollen$daily %>%
  group_by(trap) %>%
  summarise(occurence = sum(conc != 0)) %>%
  ungroup()

pol_maximum <- pollen$daily %>%
  group_by(trap) %>%
  summarise(maximum = max(conc)) %>%
  ungroup()

pol_average <- pollen$daily %>%
  group_by(trap) %>%
  summarise(average = mean(conc)) %>%
  ungroup()

pol_median <- pollen$daily %>%
  group_by(trap) %>%
  summarise(median = median(conc)) %>%
  ungroup()

pol_spi <- pollen$daily %>% # Seasonal Pollen Integral
  group_by(trap) %>%
  summarise(spi = sum(conc)) %>%
  ungroup()

kable_metrics <- pol_occurence %>%
  inner_join(pol_maximum, by = "trap") %>%
  inner_join(pol_average, by = "trap") %>%
  inner_join(pol_median, by = "trap") %>%
  inner_join(pol_spi, by = "trap") %>%
  mutate(across(where(is.numeric), round)) %>%
  kable(escape = FALSE) %>%
  kable_styling(c("striped", "condensed"),
                full_width = FALSE,
                font_size = 20) %>%
  column_spec(1, italic = TRUE, border_right = TRUE)

save_kable(kable_metrics, file = paste0(here(), "/tables/table2_metrics.html"))
kable_metrics

We can see that already for daily values the differences between traps are substantial. Next, we investigate how large the spread within the measurements is for each trap for different temporal resolutions. We can see that not only the absolute measurements but also the spread of the measurements varies largely between traps.

kable_variability <- map(
  pollen, ~ .x %>%
    group_by(trap) %>%
    summarise(
      sd = sd(conc),
      mean = mean(conc),
      se = sd / n(),
      cv = sd / mean
    ) %>%
    mutate(across(c(sd, mean), round)) %>%
    mutate(across(c(se, cv), round, 2))
) %>%
  reduce(full_join, by = "trap") %>%
  mutate(across(everything(), replace_na, "-")) %>%
  setNames(c("Trap", rep(c("sd", "mean", "se", "cv"), times = 3))) %>%
  kable(escape = FALSE) %>%
  kable_styling(c("striped", "condensed"),
    full_width = FALSE,
    font_size = 20
  ) %>%
  add_header_above(c(
    " " = 1, "Daily Averages" = 4,
    "Six-Hour Averages" = 4,
    "Hourly Averages" = 4
  )) %>%
  column_spec(1, italic = TRUE, border_right = TRUE) %>%
  column_spec(c(5, 9, 13), border_right = TRUE)

save_kable(kable_variability,
  file = paste0(here(), "/tables/table3_variability.html")
)
kable_variability

Setting a coarser temporal resolution reduces the variability in the measurements. The coefficient of variation (CV) for the Polenos is the largest in the study. On the other side of the spectrum, RapidE has the lowest spread in the measurements.

Some Plots

theme_set(theme_minimal(base_size = 14) + theme(legend.title = element_blank()))
traps_names <- levels(pollen$daily$trap)
traps_cols <- c("#252424", "#E5C494", "#8DA0CB", "#4a5f92", "#e98962", "#c25a30", "#FFD92F", "#66C2A5")
names(traps_cols) <- traps_names
traps_lty <- c(1, 1, 1, 3, 1, 3, 1, 1)
names(traps_lty) <- traps_names

# For the timeseries plot we need one more color and lty (both hirst traps plotted)
traps_cols_hirst <- c("#353333", "#252424", "#E5C494", "#8DA0CB", "#4a5f92", "#e98962", "#c25a30", "#FFD92F", "#66C2A5")
traps_names_hirst <- levels(pollen_full_with_hirst$daily$trap)
names(traps_cols_hirst) <- traps_names_hirst
traps_lty_hirst <- c(1, 3, 1, 1, 3, 1, 3, 1, 1)
names(traps_lty_hirst) <- traps_names_hirst

Timeseries

The differences between the traps are quite substantial looking at these timeseries. Most traps were able to identify days with higher pollen occurrence, but the performance varies a lot (maybe also species dependent). The concentrations are scaled here to match Hirst (conc / mean(trap_i) * mean(trap_hirst)). Where the mean for each trap was calculated across all days of the study. For this timeseries plot both Hirst devices are plotted separately

pmap(
  list(
    pollen_full_with_hirst,
    aggregation,
    c("2019-05-31", "2019-05-01", "2019-04-25")
  ),
  function(first, second, third) {
    gg_time1 <- first %>%
      group_by(trap) %>%
      mutate(mean = mean(conc, na.rm = TRUE)) %>%
      ungroup() %>%
      left_join(
        first %>%
          filter(trap %in% c("Hirst1", "Hirst2")) %>%
          group_by(datetime) %>%
          mutate(Hirst = mean(conc, na.rm = TRUE)) %>%
          ungroup() %>%
          mutate(mean_hirst = mean(Hirst)) %>%
          select(datetime, mean_hirst),
        by = "datetime"
      ) %>%
      mutate(conc = conc / mean * mean_hirst) %>%
      pad(
        start_val = min(first %>% pull(datetime)),
        end_val = max(first %>% pull(datetime)),
        group = c("trap"),
        by = "datetime"
      ) %>%
      filter(date < date(third)) %>%
      ggplot(aes(x = datetime)) +
      geom_line(aes(y = conc, col = trap, lty = trap), size = 1, alpha = 0.6) +
      labs(y = "Mean Conc. [Pollen/m³]", x = "") +
      ggtitle(paste0(
        "Timeseries of ",
        tools::toTitleCase(second),
        " Total Pollen Concentrations"
      )) +
      scale_color_manual(values = traps_cols_hirst) +
      scale_linetype_manual(values = traps_lty_hirst)

    ggsave(paste0(here(), "/figures/fig1_timeseries_", second, ".png"),
      gg_time1,
      width = 12, height = 9, dpi = 300
    )
    gg_time1
  }
)

Looking at the boxplots we can see how the spread in the measurements increases with higher temporal resolution; especially for Hirst. Wibs, Rapide and KHB measured less pollen than the other traps, overall.

map2(pollen, aggregation, function(first, second) {
  gg_box1 <- first %>%
    ggplot() +
    geom_boxplot(aes(y = log10(conc + 1), fill = trap, lty = trap), alpha = 0.8) +
    labs(y = "Log Mean Conc. [Pollen/m³]", x = "") +
    scale_fill_manual(values = traps_cols) +
    scale_linetype_manual(values = traps_lty) +
    ggtitle(paste0(
      "Boxplot of ",
      tools::toTitleCase(second),
      " Total Pollen Concentrations"
    )) +
    theme(legend.position = "bottom") +
    theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
    coord_cartesian(ylim = c(0, 3))

  ggsave(paste0(here(), "/figures/fig2_boxplot_", second, ".png"),
    gg_box1,
    width = 12, height = 9, dpi = 300
  )
  gg_box1
})

Looking at the histograms, the findings above are solidified. Furthermore, we clearly see the discrete timeseries for higher resolutions as produced by Hirst Traps and Manual Counting.

label_xy <- c(6, 20, 250)

pmap(list(pollen, aggregation, label_xy), function(first, second, third) {
  sd_comp <- first %>%
    group_by(trap) %>%
    summarise(sd = sd(conc))

  gg_hist1 <- first %>%
    ggplot() +
    geom_histogram(aes(y = log10(conc + 1), fill = trap),
      alpha = 0.8, binwidth = 0.1
    ) +
    geom_label(
      data = sd_comp,
      aes(
        label = paste(
          "Standard Deviation:\n",
          round(sd),
          "Pollen / m³"
        ),
        x = third,
        y = 1, group = trap
      ), size = 3
    ) +
    facet_wrap(vars(trap), ncol = 3) +
    theme(legend.position = "bottom") +
    coord_flip() +
    labs(
      x = "Occurence of Pollen Concentrations",
      y = "Log Mean Conc. [Pollen/m³]"
    ) +
    ggtitle(paste0(
      "Histogram of ",
      tools::toTitleCase(second),
      " Total Pollen Concentrations"
    )) +
    scale_fill_manual(values = traps_cols)

  ggsave(paste0(here(), "/figures/fig3_histogram_", second, ".png"),
    gg_hist1,
    width = 12, height = 9, dpi = 300
  )
  gg_hist1
})

Relative Errors

Looking at absolute relative differences between Hirst and the other traps, one can investigate which traps diverge the most from the historically used Hirst traps. The general findings seem to persist across different temporal resolutions. It is common to exclude measurements below a 10 pollen grains per m^3 threshold, because of the large uncertainty of these low concentration measurements.

conc_groups <- c(
    "Group10_20",
    "Group20_50",
    "Group50_100",
    "Group100_300",
    "Group300"
  )

errors <- map(pollen, ~ .x %>%
  # Here we previously used the common mean of all traps,
  # but now decided to compare to Hirst
  pivot_wider(names_from = trap, values_from = conc) %>%
  # Divsion by zero otherwise leads to many NAs, especially for hourly values
  mutate(mean = Hirst)) %>%
  map_at(3, ~ .x %>%
    mutate(
      hund = NA_real_
    )) %>%
  map(~ .x %>%
    select(datetime, date, hour, mean, any_of(traps_names)) %>%
    pivot_longer(any_of(traps_names), names_to = "trap", values_to = "conc") %>%
    mutate(reldiff = abs(conc - mean) / mean,
           trap = factor(trap, levels = traps_names)) %>%
    filter(mean > 10))

errors_conc <- map(errors, ~ .x %>%
  mutate(
    group = case_when(
      mean >= 10 & mean < 20 ~ "Group10_20",
      mean >= 20 & mean < 50 ~ "Group20_50",
      mean >= 50 & mean < 100 ~ "Group50_100",
      mean >= 100 & mean < 300 ~ "Group100_300",
      mean >= 300 ~ "Group300"
    ),
    group = factor(group, levels = conc_groups)))

The relative differences from the Hirst mean are substantial, Hund and Poleno measure more pollen than Hirst; KHB, Wibs and Rapide measure less. For higher temporal Resolutions the differences become larger. Please note that some outliers (black dots) were cut off, reaching up to almost 5000%. This does not mean that they measure worse than the rest. There is no golden standard in measuring pollen.

map2(errors, aggregation, function(first, second) {
  gg_errors1 <- first %>%
    filter(trap != "Hirst") %>%
    ggplot(aes(y = reldiff, fill = trap, lty = trap)) +
    geom_boxplot() +
    labs(y = "% Relative Difference from Hirst: |Trap\u1d62 - Hirst| / Hirst", x = "") +
    scale_fill_manual(values = traps_cols) +
    scale_linetype_manual(values = traps_lty) +
    scale_y_continuous(labels=scales::percent) +
    ggtitle(paste0(
      "Boxplot of ",
      tools::toTitleCase(second),
      " Measurement Differences from Hirst"
    )) +
    theme(legend.position = "bottom") +
    theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
    coord_cartesian(ylim = c(0, 10))

  ggsave(paste0(here(), "/figures/fig4_relerror_boxplot_", second, ".png"),
    gg_errors1,
    width = 12, height = 9, dpi = 300
  )
  gg_errors1
})

The spread in the measurement seems to correlate with the total pollen concentrations. For lower concentrations Poleno and Hund are measuring more then the Hirst (better capture?). KHB, Wibs and Rapide are consistently measuring less pollen than the other traps in the study. Be careful, these are log scales, as the differences can become very large. A value of 1 on the y axis reflects a 10x larger value for that trap compared to Hirst.

map2(errors_conc, aggregation, function(first, second) {
  gg_conc1 <- first %>%
  filter(trap != "Hirst") %>%
    ggplot(aes(y = reldiff, fill = trap, lty = trap)) +
    geom_boxplot() +
    labs(y = "% Relative Difference from Hirst: |Trap\u1d62 - Hirst| / Hirst", x = "") +
    scale_fill_manual(values = traps_cols) +
    scale_linetype_manual(values = traps_lty) +
    ggtitle(paste0(
      "Boxplot of Relative ",
      tools::toTitleCase(second),
      " Measurement Differences"
    )) +
    theme(legend.position = "bottom") +
    annotation_logticks(sides = "l") +
    scale_y_continuous(trans = 'log10',
                       breaks = scales::trans_breaks('log10', function(x) 10^x),
                       labels = scales::label_percent(accuracy = 1, big.mark = "'"),
                       limits = c(0.01, 100)) +
    theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
    facet_wrap(~group)

  ggsave(paste0(here(), "/figures/fig5_relerror_conc_boxplot_", second, ".png"),
    gg_conc1,
    width = 12, height = 9, dpi = 300
  )
  gg_conc1
})

The table shows the relative differences from Hirst for all traps combined.

kable_error <- map(errors_conc, ~.x %>%
  na.omit() %>% # We introduced NAs for hund for nicer plots above.
  group_by(group) %>%
  summarise(
    mean = mean(reldiff),
    sd = sd(reldiff),
    q25 = quantile(reldiff, 0.25),
    median = median(reldiff),
    q75 = quantile(reldiff, 0.75)
  ) %>%
  ungroup %>%
  bind_rows(
    .x %>%
    na.omit() %>%
      summarise(
        mean = mean(reldiff),
        sd = sd(reldiff),
        q25 = quantile(reldiff, 0.25),
        median = median(reldiff),
        q75 = quantile(reldiff, 0.75)
      ) %>%
      mutate(group = "all") %>%
      ungroup())) %>%
  bind_rows() %>%
  mutate(across(!group, round, 2)) %>%
    kable(escape = FALSE) %>%
  kable_styling(c("striped", "condensed"),
    full_width = FALSE,
    font_size = 20
  ) %>%
  pack_rows("Daily", 1, 5, indent = FALSE) %>%
  pack_rows("Six-Hour", 6, 11, indent = FALSE) %>%
  pack_rows("Hourly", 12, 17, indent = FALSE) %>%
  add_header_above(c("Relative Differences for All Traps Combined" = 6))

save_kable(kable_error,
  file = paste0(here(), "/tables/table4_relerror.html")
)

kable_error

The density plots outline the behavior of all traps for different concentration categories (based on the Hirst mean). KHB and Rapide stand out, usually measuring less than the others for concentrations below 300. Poleno further stand out with more pollen measured during periods of low concentrations.

map2(errors_conc, aggregation, function(data, second) {
  gg_conc_dens <- map2(
    conc_groups,
    c(75, 200, 200, 300, 1000),
    function(first, second) {
      conc_plot_data <- data %>%
        filter(group == first)

      if (nrow(conc_plot_data) > 0) {
        conc_plot_data %>%
          ggplot() +
          geom_density(aes(x = conc, col = trap, fill = trap, lty = trap), alpha = 0.1) +
          scale_fill_manual(
            values = traps_cols,
            aesthetics = c("colour", "fill")
          ) +
          scale_linetype_manual(values = traps_lty) +
          ggtitle(first) +
          theme(legend.position = "bottom") +
          coord_cartesian(xlim = c(0, second))
      }
    }
  )
  gg_density1 <- ggarrange(
    plotlist = gg_conc_dens
  ) %>%
    annotate_figure(
      top = text_grob(paste(
        "Density Plot of",
        second,
        "Total Pollen Measurements"
      ),
      size = 20
      )
    )

  ggsave(paste0(here(), "/figures/fig6_conc_density_", second, ".png"),
    gg_density1,
    width = 12, height = 9, dpi = 300
  )
  gg_density1
})

Resdiual Analysis

Pollen data usually does not fulfil the assumption of ANOVA. We want to check this in the following.

are the errors normally distributed?

In a QQ-plot we plot the empirical quantiles (“what we see in the data”) vs. the theoretical quantiles (“what we expect from the model”). The plot should show a more or less straight line if the distributional assumption is correct. By default, a standard normal distribution is the theoretical “reference distribution”.

pollen_10 <- map(pollen, ~ .x %>%
  pivot_wider(names_from = trap, values_from = conc) %>%
  filter(Hirst > 10) %>%
  pivot_longer(any_of(traps_names), names_to = "trap", values_to = "conc") %>%
  mutate(trap = factor(trap),
         trap = relevel(trap, ref = "Hirst")))

pollen_10_log <- map(pollen, ~ .x %>% mutate(conc = log(conc + 1)))

The default contrast is contr treatment and with relevel we made sure that Hirst is the reference level.

map2(pollen_10, pollen_10_log, function(first, second) {
  fit_anova <- aov(conc ~ trap,
    data = first)

  fit_anova_log <- aov(conc ~ trap,
    data = second
  )

  gg_res1 <- tibble(residuals = residuals(fit_anova, type = "pearson")) %>%
    ggplot(aes(sample = residuals)) +
    stat_qq(col = traps_cols[2]) +
    stat_qq_line(col = traps_cols[1])

  gg_res2 <- tibble(residuals = residuals(fit_anova_log, type = "pearson")) %>%
    ggplot(aes(sample = residuals)) +
    stat_qq(col = traps_cols[2]) +
    stat_qq_line(col = traps_cols[1])

  ggarrange(gg_res1, gg_res2, nrow = 1) %>%
    annotate_figure(top = paste(
      "QQ-Plot for the ANOVA Residuals With",
      "(right) and Without Logarithmizing"
    ))
})

is the error variance constant?
do the errors have mean zero?

The Tukey-Anscombe plot plots the residuals vs. the fitted values. It allows us to check whether the residuals have constant variance and whether the residuals have mean zero (i.e. they don’t show any deterministic pattern).

map2(pollen_10, pollen_10_log, function(first, second) {
  fit_anova <- aov(conc ~ trap,
    data = first
  )

  fit_anova_log <- aov(conc ~ trap,
    data = second
  )

  gg_tukey1 <- tibble(
    resid = residuals(fit_anova,
      type = "pearson"
    ),
    fitted = fit_anova$fitted.values
  ) %>%
    ggplot(aes(x = fitted, y = resid)) +
    geom_point(
      alpha = 0.5,
      position = position_jitter(
        width = 5,
        height = 0
      ),
      col = traps_cols[2]
    ) +
    # geom_smooth(method = "loess", col = traps_cols[4]) +
    geom_abline(slope = 0, intercept = 0, col = traps_cols[1], alpha = 0.9) +
    coord_cartesian(ylim = c(-500, 500))

  gg_tukey2 <-
    tibble(
      resid = residuals(fit_anova_log,
        type = "pearson"
      ),
      fitted = fit_anova_log$fitted.values
    ) %>%
    ggplot(aes(x = fitted, y = resid)) +
    geom_point(
      alpha = 0.5,
      position = position_jitter(width = 0.02, height = 0),
      col = traps_cols[2]
    ) +
    geom_abline(slope = 0, intercept = 0, col = traps_cols[1], alpha = 0.9) +
    coord_cartesian(ylim = c(-3, 3))


  ggarrange(gg_tukey1, gg_tukey2) %>%
    annotate_figure(top = paste(
      "Tukey Anscombe - Plot for the ANOVA Residuals",
      "With (right) and Without Logarithmizing"
    ))
})

are the errors independent?

If the data has some serial structure (i.e., if observations were recorded in a certain time order), we typically want to check whether residuals close in time are more similar than residuals far apart, as this would be a violation of the independence assumption. We can do so by using a so-called index plot where we plot the residuals against time. For positively dependent residuals we would see time periods where most residuals have the same sign, while for negatively dependent residuals, the residuals would “jump” too often from positive to negative compared to independent residuals.

map2(pollen_10, pollen_10_log, function(first, second) {
  fit_anova <- aov(conc ~ trap,
    data = first
  )
  fit_anova_log <- aov(conc ~ trap,
    data = second
  )
  resid <- residuals(fit_anova, type = "pearson")
  resid_df <- tibble(resid = resid, id = as.numeric(names(resid)))

  gg_timeline1 <- tibble(
    id = seq_len(nrow(first)),
    time = first$datetime,
    trap = first$trap
  ) %>%
    left_join(resid_df, by = "id") %>%
    ggplot(aes(x = time, y = resid)) +
    geom_point(aes(col = trap)) +
    geom_line(aes(col = trap), alpha = 0.3) +
    scale_color_manual(values = traps_cols) +
    coord_cartesian(ylim = c(0, 1000))

  resid_log <- residuals(fit_anova_log, type = "pearson")
  resid_df_log <- tibble(resid = resid_log, id = as.numeric(names(resid_log)))

  gg_timeline2 <- tibble(
    id = seq_len(nrow(second)),
    time = second$datetime,
    trap = second$trap
  ) %>%
    left_join(resid_df_log, by = "id") %>%
    ggplot(aes(x = time, y = resid)) +
    geom_point(aes(col = trap)) +
    geom_line(aes(col = trap), alpha = 0.3) +
    scale_color_manual(values = traps_cols) +
    coord_cartesian(ylim = c(0, 8))

  ggarrange(gg_timeline1, gg_timeline2, nrow = 2) %>%
    annotate_figure(top = paste("Index-Plot for the ANOVA Residuals",
                                 "With (bottom) and Without Logarithmizing"))
})

Especially for higher temporal resolution none of the assumption are fulfilled and we should clearly use robust methods to analyse this dataset.

Correlation

We are using the robust method from Spearman, comparing Rho. The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the rank variables. https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient https://stats.stackexchange.com/questions/3943/kendall-tau-or-spearmans-rho

data_corr <- map(pollen_10, ~ .x %>%
  select(conc, trap, datetime) %>%
  pivot_wider(names_from = trap, values_from = conc, datetime))

corr_matrix <- map(data_corr, ~ corr.test(
  .x %>% select(-datetime),
  use = "complete",
  method = "spearman", # Requested by Reviewer, I do not agree as normality assumptions are not fulfiled
  adjust = "holm",
  alpha = .05,
  ci = TRUE,
  minlength = 5
) %>%
  extract2(1))

corr_tb <- map(corr_matrix, ~ .x %>%
  as_tibble() %>%
  mutate(trap = rownames(.x))) %>%
  map_at(c(2, 3), ~ .x %>%
    mutate(
      hund = NA_real_
    )) %>%
  map(~ .x %>%
    select(trap, any_of(traps_names)) %>%
    arrange(trap))

kable_corr <-  corr_tb %>%
  bind_rows() %>%
  mutate(across(where(is.numeric), round, 2)) %>%
  mutate(across("BAA-500", replace_na, "-")) %>%
  setNames(c("", traps_names)) %>%
  kable(escape = FALSE) %>%
  kable_styling(c("striped", "condensed"),
    full_width = FALSE,
    font_size = 20
  ) %>%
  pack_rows("Daily", 1, 8, indent = FALSE) %>%
  pack_rows("Six-Hour", 9, 16, indent = FALSE) %>%
  pack_rows("Hourly", 17, 22, indent = FALSE) %>%
  add_header_above(c("Pearson R for Different Temporal Resolutions" = 9))

save_kable(kable_corr,
  file = paste0(here(), "/tables/table5_correlation_spearman.html")
)
kable_corr

Kruskal-Wallis Test / Omnibus Test

Kruskal-Wallis test by rank is a non-parametric alternative to one-way ANOVA test, which extends the two-samples Wilcoxon test in the situation where there are more than two groups. It’s recommended when the assumptions of one-way ANOVA test are not met. This tutorial describes how to compute Kruskal-Wallis test in R software. (http://www.sthda.com/english/wiki/kruskal-wallis-test-in-r)

Assumptions The assumptions of the Kruskal-Wallis test are similar to those for the Wilcoxon-Mann-Whitney test.

Samples are random samples, or allocation to treatment group is random.
The two samples are mutually independent.
The measurement scale is at least ordinal, and the variable is continuous.
If the test is used as a test of dominance, it has no distributional assumptions. If it used to compare medians, the distributions must be similar apart from their locations.

The test is generally considered to be robust to ties. However, if ties are present they should not be concentrated together in one part of the distribution (they should have either a normal or uniform distribution) https://influentialpoints.com/Training/kruskal-wallis_anova-principles-properties-assumptions.htm

map(pollen_10_log, function(first) {
  kruskal.test(conc ~ trap,
    data = first %>%
      mutate(trap = factor(trap))
  )
})

The Null-Hypothesis that the median of all traps are equal can clearly be rejected.

Robust Contrasts with Confidence Intervals

https://www.researchgate.net/publication/282206980_nparcomp_An_R_Software_Package_for_Nonparametric_Multiple_Comparisons_and_Simultaneous_Confidence_Intervals The R package nparcomp implements a broad range of rank-based nonparametric methods for multiple comparisons. The single step procedures provide local test decisions in terms of multiplicity adjusted p-values and simultaneous conﬁdence intervals. The null hypothesis H0: p = 1/2 is significantly rejected at 5% level of significance for many pairwise comparisons. Whenever the p-Value is < than 5% = the confidence interval contains 0.5 -> the effect from the factor trap is not statistically meaningful. The Estimator can also be interpreted as a proxy for the relative difference in median for two traps. If the Estimator is > 0.5 then the second trap tends to have larger measurements.

kable_nparcomp <- map2(pollen_10_log, aggregation, function(first, second) {
  npar_contr <- nparcomp::nparcomp(
    conc ~ trap,
    first,
    conf.level = 0.95,
    alternative = "two.sided",
    type = "Dunnet",
    control = "Hirst"
  )

  title <- paste(
    "Robust Contrasts and Confidence Intervals for",
    second, "Measurements"
  )
  myheader <- c(title = 5)
  names(myheader) <- title

  npar_contr %>%
    extract2("Analysis") %>%
    mutate(across(where(is.numeric), round, 3)) %>%
    select(Traps = Comparison, Estimator, Lower, Upper, pValue = p.Value) %>%
    mutate(pValue = if_else(pValue < 0.05,
      cell_spec(pValue, color = "red"),
      cell_spec(pValue)
    )) %>%
    kable(escape = FALSE) %>%
    kable_styling("striped", full_width = FALSE) %>%
    add_header_above(myheader)
})

save_kable(kable_nparcomp,
  file = paste0(
    here(),
    "/tables/table6_nparcomp.html"
  )
)

kable_nparcomp