knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
This brief vignette shows a basic workflow for selecting recordings from different times of day, by site and year.
First, we'll load the packages we want to work with.
#| message: false
library(ARUtools)
library(dplyr)
library(purrr)
library(tidyr)
library(glue)
library(lubridate)
Next we'll prepare the metadata for the recordings by cleaning it, adding site-level information, and calculating the time to sunrise/sunset for each file. We'll also classify each recording as either 'early' (occurring before 6am) or 'late' (occurring at or after 6am).
s <- clean_site_index(example_sites_clean,
  name_date = c("date_time_start", "date_time_end")
)
m <- clean_metadata(project_files = example_files) |>
  add_sites(s) |>
  calc_sun() |>
  mutate(
    time_period = if_else(hour(date_time) < 6, "early", "late"),
    year = year(date)
  )
m
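As a quick check on where the 6am cut-off falls, here is a minimal base-R sketch of the same classification. The timestamps are made-up illustration values rather than rows from example_files, and as.POSIXlt(...)$hour stands in for lubridate's hour().

```r
# Toy recording times (hypothetical values, not taken from example_files)
date_time <- as.POSIXct(
  c(
    "2020-05-02 04:30:00", "2020-05-02 05:59:00",
    "2020-05-02 06:00:00", "2020-05-03 07:15:00"
  ),
  tz = "UTC"
)

# Base-R equivalent of lubridate::hour()
hr <- as.POSIXlt(date_time)$hour

# Same rule as in the vignette: hours 0-5 are "early", everything else is "late"
time_period <- ifelse(hr < 6, "early", "late")

data.frame(date_time, hr, time_period)
```

A recording at exactly 06:00 has an hour value of 6, so it falls in the 'late' group.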
Time to do some sampling!
First we define the selection parameters for each time frame we're interested in sampling. This might be "dawn" and "dusk", or in this example, "early" and "late" morning.
We define them with sim_selection_weights(), which also simulates the selection weights so we can see what we've defined.
#| fig-width: 12
#| fig-asp: 0.7
#| out-width: 80%
p <- list(
  "early" = sim_selection_weights(min_range = c(-70, 240)),
  "late" = sim_selection_weights(min_range = c(100, 300), min_mean = 200)
)
p
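To build some intuition for what a range and a mean can do to a weight curve, the sketch below uses a plain normal density that is set to zero outside an allowed window. This is only an illustration: toy_weight() is a hypothetical helper, the spread of 60 and the 'early' mean of 30 are assumed values, and the actual calculation inside ARUtools may differ.

```r
# Hypothetical helper for illustration only: NOT part of ARUtools
toy_weight <- function(min_to_sun, min_range, min_mean, min_sd = 60) {
  # Bell-shaped weight centred on min_mean (min_sd = 60 is an assumed value)
  w <- dnorm(min_to_sun, mean = min_mean, sd = min_sd)
  # Recordings outside the allowed window get zero weight
  w[min_to_sun < min_range[1] | min_to_sun > min_range[2]] <- 0
  w
}

# Minutes relative to sunrise, on a coarse grid
mins <- seq(-120, 360, by = 30)

# The ranges mirror the vignette; the 'early' mean of 30 is an assumption
early_w <- toy_weight(mins, min_range = c(-70, 240), min_mean = 30)
late_w <- toy_weight(mins, min_range = c(100, 300), min_mean = 200)

data.frame(mins, early_w = round(early_w, 4), late_w = round(late_w, 4))
```

In this toy version, shifting the mean moves the peak of the curve, while the range decides which recordings are eligible at all.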
Now we can calculate selection weights
Here we'll calculate a separate set of selection weights for early and late recordings in each year. Then we'll group recordings by site, year, and time period.
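w <- m |>
  nest(data = c(-time_period, -year)) |>
  mutate(
    # p is attached by position, so this assumes one nested row per
    # element of p, in the same order (early, then late)
    params = p,
    sel = map2(data, params, calc_selection_weights)
  ) |>
  unnest(sel) |>
  select(-"data", -"params") |>
  mutate(selection_group = glue("{site_id}_{year}_{time_period}"))
w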
The data set w contains the original recordings, now with additional columns holding various measures of the probability of selection.
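The pipeline above attaches the list p to the nested groups by position. If the group order could ever differ from the order of p, looking the parameters up by name is one way to guard against a mismatch. The base-R sketch below shows that idea on toy data; the data frame, the parameter list, and the dist_from_mean column are all hypothetical stand-ins, not ARUtools objects.

```r
# Toy recordings (hypothetical values); note "late" appears first
recs <- data.frame(
  time_period = c("late", "early", "late", "early"),
  min_to_sun = c(210, 20, 150, -40)
)

# Hypothetical per-period parameters, standing in for the list p
params <- list(
  early = list(min_mean = 30),
  late = list(min_mean = 200)
)

# One data frame per time period, named by period
groups <- split(recs, recs$time_period)

# Pair each group with its parameters by NAME rather than by position
scored <- Map(
  function(d, prm) {
    d$dist_from_mean <- abs(d$min_to_sun - prm$min_mean)
    d
  },
  groups,
  params[names(groups)]
)

result <- do.call(rbind, scored)
result
```

In a dplyr pipeline the same lookup could be written as params = p[time_period] inside mutate(), assuming the names of p match the values of time_period.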
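Next we'll define the number of samples we'd like for each selection group: 5 for early groups and 2 for late groups, plus a number of oversamples (n_os), both capped by how many recordings are actually available.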
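n <- w |>
  summarize(n_recordings = n(), .by = c("selection_group", "time_period")) |>
  mutate(
    # Target number of samples: 5 for early groups, 2 for late groups
    n = if_else(time_period == "early", 5, 2),
    n_os = if_else(time_period == "early", floor(n * 1 / 3), floor(n * 1 / 4)),
    # Note: this second assignment overwrites the n_os computed on the line above
    n_os = pmax(0, pmin(n_recordings - n, round(n / 3))),
    # Never request more samples than there are recordings
    n = pmin(n, n_recordings)
  )
n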
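To see how the caps behave, here is the same arithmetic worked through in base R for three hypothetical 'early' groups with 10, 6, and 1 available recordings. The counts are made up for illustration.

```r
# Hypothetical numbers of available recordings in three 'early' groups
n_recordings <- c(10, 6, 1)

# Target sample size for early groups, as in the vignette
n_target <- 5

# Oversamples: about a third of the target, but no more than the
# recordings left over after the main sample, and never negative
n_os <- pmax(0, pmin(n_recordings - n_target, round(n_target / 3)))

# Main sample: the target, capped at the number of recordings available
n <- pmin(n_target, n_recordings)

data.frame(n_recordings, n, n_os)
```

With 10 recordings the group gets the full 5 samples and 2 oversamples; with 6 it gets 5 samples and only 1 oversample; with a single recording it gets 1 sample and no oversamples.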
And finally sample the recordings!
g <- sample_recordings(w, n,
  col_site_id = selection_group,
  col_sel_weights = psel_normalized
)
g
The recordings selected for sampling...
g$sites_base
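For readers who want a feel for what weighted selection within groups means, the sketch below draws a fixed number of recordings per group with base R's sample(), using the weights as selection probabilities. This is a deliberately simple stand-in, not the method used by sample_recordings(), which may rely on a different and more sophisticated sampling design. All data values here are hypothetical.

```r
set.seed(42)

# Toy recordings with normalized selection weights (hypothetical values)
recs <- data.frame(
  file = paste0("rec_", 1:8),
  selection_group = rep(c("A_2020_early", "A_2020_late"), each = 4),
  psel_normalized = c(0.9, 0.5, 0.1, 0.3, 0.2, 0.8, 0.6, 0.4)
)

# Hypothetical number of samples wanted per group
n_per_group <- c(A_2020_early = 2, A_2020_late = 1)

# Within each group, draw rows without replacement,
# with probability proportional to the selection weight
picked <- lapply(split(recs, recs$selection_group), function(d) {
  k <- n_per_group[[d$selection_group[1]]]
  d[sample(nrow(d), size = k, prob = d$psel_normalized), ]
})
picked <- do.call(rbind, picked)
picked
```

Recordings with higher weights are more likely to be drawn, but any recording with a non-zero weight can still be selected.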