Sampling recordings - Multiple Time Periods

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

This brief vignette shows a basic workflow for selecting recordings at different times of day, by site and year.

First we'll load the packages we want to work with.

#| message: false
library(ARUtools)
library(dplyr)
library(purrr)
library(tidyr)
library(glue)
library(lubridate)

Next we'll prepare our metadata on the recordings by cleaning it, adding site-level information, and calculating the time to sunrise/sunset for each file. We'll also define recordings as either 'early' (starting before 6am) or 'late' (starting at or after 6am).

s <- clean_site_index(example_sites_clean,
  name_date = c("date_time_start", "date_time_end")
)
m <- clean_metadata(project_files = example_files) |>
  add_sites(s) |>
  calc_sun() |>
  mutate(
    time_period = if_else(hour(date_time) < 6, "early", "late"),
    year = year(date)
  )
m
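To make the early/late cut-off concrete, here is a minimal base R sketch of the same rule applied to a few hypothetical timestamps (they are not taken from example_files). It uses format() in place of lubridate's hour(), and shows that a recording starting at exactly 06:00 is classed as 'late'.

```r
# Hypothetical timestamps for illustration only
date_time <- as.POSIXct(
  c("2020-05-02 05:59:00", "2020-05-02 06:00:00", "2020-05-02 07:30:00"),
  tz = "UTC"
)

# Extract the hour as an integer (base R stand-in for lubridate::hour())
hr <- as.integer(format(date_time, "%H"))

# Same rule as in the chunk above: strictly before 6 is "early"
time_period <- ifelse(hr < 6, "early", "late")

data.frame(date_time, time_period)
```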

Time to do some sampling!

First we define the selection parameters for each time frame we're interested in sampling. This might be "dawn" and "dusk", or in this example, "early" and "late" morning.

The sim_selection_weights() function also simulates the selection weights and plots them, so we can see what we've defined.

#| fig-width: 12
#| fig-asp: 0.7
#| out-width: 80%
p <- list(
  "early" = sim_selection_weights(min_range = c(-70, 240)),
  "late" = sim_selection_weights(min_range = c(100, 300), min_mean = 200)
)
p
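As a rough intuition for what these parameters control, the sketch below builds a toy weight curve over minutes to sunrise. It is not the ARUtools implementation: the helper toy_weight(), the bell-shaped (normal) curve, and the spread of 60 minutes are all assumptions made for illustration. Only the range c(100, 300) and the mean of 200 come from the 'late' definition above.

```r
# Hypothetical helper, NOT part of ARUtools: a bell-shaped weight
# centred on min_mean, set to zero outside min_range
toy_weight <- function(min_to_sun, min_range, min_mean, min_sd) {
  w <- dnorm(min_to_sun, mean = min_mean, sd = min_sd)
  # Recordings outside the allowed range can never be selected
  w[min_to_sun < min_range[1] | min_to_sun > min_range[2]] <- 0
  w
}

# Hypothetical recording times, in minutes relative to sunrise
mins <- c(50, 150, 200, 250, 350)

# min_sd = 60 is an assumed value chosen for this example
w_late <- toy_weight(mins, min_range = c(100, 300), min_mean = 200, min_sd = 60)

# Rescale so the most preferred time has weight 1
round(w_late / max(w_late), 2)
```

Recordings closest to the preferred time receive the largest weight, and anything outside the range is excluded.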

Now we can calculate selection weights.

Here we'll calculate a separate set of selection weights for early and late recordings in each year. Then we'll group recordings by site, year, and time period.

w <- m |>
  nest(data = c(-time_period, -year)) |>
  mutate(
    params = p[time_period], # match each group's parameters by name, not by row order
    sel = map2(data, params, calc_selection_weights)
  ) |>
  unnest(sel) |>
  select(-"data", -"params") |>
  mutate(selection_group = glue("{site_id}_{year}_{time_period}"))
w
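The split-apply-combine pattern used above can be sketched in base R with toy data. Everything here is hypothetical (the file names, the parameter values, and the stand-in function toy_calc()). The point is only to show how each group is paired with its own parameter set by name rather than by position.

```r
# Toy data: hypothetical recordings with a time period label
recs <- data.frame(
  file = c("a.wav", "b.wav", "c.wav", "d.wav"),
  time_period = c("late", "early", "late", "early"),
  min_to_sun = c(180, 20, 220, -10)
)

# One hypothetical parameter set per time period
params <- list(
  early = list(min_mean = 30),
  late = list(min_mean = 200)
)

# Stand-in for a weighting function: distance from the preferred time
toy_calc <- function(data, par) {
  data$dist <- abs(data$min_to_sun - par$min_mean)
  data
}

# Split into one data frame per time period
groups <- split(recs, recs$time_period)

# Index params by group name so each group gets its own settings
res <- Map(toy_calc, groups, params[names(groups)])

# Recombine into a single data frame
out <- do.call(rbind, res)
out
```

Matching by name keeps the pairing correct even if the groups appear in a different order than the parameter list.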

The w data set contains the original recordings, along with new columns containing various measures of the probability of selection.

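We'll define the number of samples we'd like to have in each selection group: up to five for early recordings and two for late recordings, plus a few oversamples (n_os) where enough recordings are available.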

n <- w |>
  summarize(n_recordings = n(), .by = c("selection_group", "time_period")) |>
  mutate(
    n = if_else(time_period == "early", 5, 2),
    # Oversamples: about a third of n, capped by the recordings left over
    n_os = pmax(0, pmin(n_recordings - n, round(n / 3))),
    n = pmin(n, n_recordings)
  )
n
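The capping logic in the final n_os and n expressions is easier to follow with numbers. The sketch below applies the same two expressions to three hypothetical 'early' groups with 10, 6, and 3 available recordings.

```r
# Hypothetical numbers of available recordings in three groups
n_recordings <- c(10, 6, 3)

# Requested sample size for an "early" group
n_target <- 5

# Oversamples: round(5 / 3) = 2, but never more than the recordings
# left over after the main sample, and never below zero
n_os <- pmax(0, pmin(n_recordings - n_target, round(n_target / 3)))

# Main sample: never more than the recordings available
n <- pmin(n_target, n_recordings)

data.frame(n_recordings, n, n_os)
```

With 10 recordings the group gets 5 samples and 2 oversamples. With 6 it gets 5 samples and 1 oversample. With only 3 it gets 3 samples and no oversamples.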

And finally sample the recordings!

g <- sample_recordings(w, n,
  col_site_id = selection_group,
  col_sel_weights = psel_normalized
)
g
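For intuition only, weighted selection within groups can be imitated in base R with sample() and its prob argument. This is a simplified stand-in and should not be read as the algorithm sample_recordings() uses internally. The file names, the site ID "P01", and the weights are all hypothetical.

```r
set.seed(42) # make this toy example reproducible

# Hypothetical recordings with normalized selection weights
toy <- data.frame(
  file = paste0("rec_", 1:8, ".wav"),
  selection_group = rep(c("P01_2020_early", "P01_2020_late"), each = 4),
  psel_normalized = c(0.9, 0.5, 0.1, 0.7, 0.2, 0.8, 0.6, 0.4)
)

# Hypothetical sample sizes, one per selection group
n_per_group <- c(P01_2020_early = 2, P01_2020_late = 1)

picked <- lapply(split(toy, toy$selection_group), function(d) {
  k <- n_per_group[[as.character(d$selection_group[1])]]
  # Sample row indices without replacement, favouring high weights
  d[sample(nrow(d), size = k, prob = d$psel_normalized), ]
})
picked <- do.call(rbind, picked)

# Number of recordings drawn from each group
table(picked$selection_group)
```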

The recordings selected for sampling are stored in sites_base:

g$sites_base



ARUtools documentation built on Oct. 9, 2024, 1:07 a.m.