suppressWarnings(library("tidyverse"))
library(ISUmonarch)

Bee data

Here are the first few rows of the bee data.frame in its simplest form.

pander::pandoc.table(bee %>%
    head(.))

If we wanted a count by species, round, and siteID, we could sum within those groups like this:

pander::pandoc.table(
  bee %>%
    group_by(`Bee Species`, round, siteID) %>%
    summarize(count = sum(count)) %>%
    head(.)
)

If we wanted to find the average count across rounds for each species and siteID, we could do this:

pander::pandoc.table(
  bee %>%
    group_by(`Bee Species`, round, siteID) %>%
    summarize(count = sum(count)) %>%
    group_by(`Bee Species`, siteID) %>%
    summarize(mean_count = mean(count)) %>%
    head(.)
)

While this table may look fine, it gives incorrect results. This is because rounds in which nothing was recorded are missing from the data rather than stored as 0 counts, so each mean is taken over too few rounds. To correct this, we need to add back the rounds and surveys with 0 counts.

pander::pandoc.table(
  bee %>%
    mutate(round = as.factor(round)) %>%
    # add every surveyed siteID/year, including those with no bee records
    full_join(survey %>% left_join(transect) %>% select("siteID","year"), by=c("siteID","year")) %>%
    # fill in every species/site/round combination, with count = 0 where missing
    complete(`Bee Species`, siteID, round, fill = list(count = 0)) %>%
    group_by(`Bee Species`, round, siteID) %>%
    summarize(count = sum(count)) %>%
    group_by(`Bee Species`, siteID) %>%
    summarize(mean_count = mean(count)) %>%
    head(.)
)

The key was using the complete function to fill in the missing combinations with 0 counts, and joining the survey and transect datasets to include sites that didn't record anything.
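
To see why the zeros matter, here is a minimal sketch on made-up data. The `toy` data.frame, its column names, and its values are hypothetical and are not part of ISUmonarch; the sketch assumes a site that was surveyed in three rounds, with nothing seen in round 2, so that round has no row at all.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from ISUmonarch): one species at site "A",
# seen in rounds 1 and 3. Round 2 was surveyed but has no row.
toy <- tibble(
  species = c("sp1", "sp1"),
  siteID  = c("A", "A"),
  round   = factor(c(1, 3), levels = 1:3),
  count   = c(4, 2)
)

# Naive mean: averages only the rounds that have rows.
naive <- toy %>%
  group_by(species, siteID) %>%
  summarize(mean_count = mean(count), .groups = "drop")

# Corrected mean: complete() first adds the missing round with count = 0.
# Because round is a factor, complete() uses all of its levels.
corrected <- toy %>%
  complete(species, siteID, round, fill = list(count = 0)) %>%
  group_by(species, siteID) %>%
  summarize(mean_count = mean(count), .groups = "drop")

naive$mean_count      # mean of 4 and 2
corrected$mean_count  # mean of 4, 0 and 2
```

The naive version divides by the two rounds that happen to have rows, while the corrected version divides by all three surveyed rounds, which is the same effect as in the bee tables above. Note that in this sketch the unused factor level is what tells complete() that round 2 exists; in the bee example that information comes from the joined survey and transect data.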



jarad/ISUmonarch documentation built on Aug. 10, 2022, 1:09 p.m.