Getting started with ARUtools

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
options(tibble.print_min = 4L, tibble.print_max = 4L)

The ARUtools package aims to make processing of large quantities of acoustic recordings easier through automation of metadata processing and sub-sampling of recordings.

Prior to working on your ARU recordings or meta data you must:

This introduction will walk through the first few steps of extracting the metadata, adding site information, and calculating sunrise and sunset information.

Read file metadata

#| message: false
library(ARUtools)

Let's use some example data to get started.

head(example_files)

This is a list of hypothetical ARU files from different sites, recorded with different ARUs. The data is fairly messy: the folders have no clear structure and the file names appear to contain unneeded characters. However, given the standard structure of site names, ARU ID codes, and datetime stamps, we can extract that information from the file structure alone.
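To make the idea of reading information "from the file structure alone" concrete, here is a minimal base R sketch. The file path and the three regular expressions are hypothetical, chosen only for illustration; this is not how clean_metadata() is implemented, just the general pattern-matching idea.

```r
# Hypothetical file path (an assumption, not taken from example_files)
path <- "a_BARLT10962_P01_1/P01_1_20200502T050000_ARU.wav"

# Assumed patterns: an ARU ID like "BARLT" + digits, a site ID like "P01_1",
# and a date/time stamp like "YYYYMMDDTHHMMSS"
aru_id  <- regmatches(path, regexpr("BARLT[0-9]+", path))
site_id <- regmatches(path, regexpr("P[0-9]{2}_[0-9]", path))
stamp   <- regmatches(path, regexpr("[0-9]{8}T[0-9]{6}", path))

# Parse the stamp into a date/time (UTC is used here purely for the sketch)
date_time <- as.POSIXct(stamp, format = "%Y%m%dT%H%M%S", tz = "UTC")

data.frame(aru_id, site_id, date_time)
```

If your own file names follow different conventions, the patterns would need to change accordingly, which is the kind of adjustment the package's arguments are meant to handle for you.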

First things first, we'll clean up the metadata associated with the files.

m <- clean_metadata(project_files = example_files)

Because our example files follow the standard formats for site ID, ARU ID, and date/time, we can extract all the information without having to change any of the default arguments.

m

If you were reading directly from files, you would assign a base directory and then have clean_metadata() read the files in that folder and its sub-folders.

base_directory <- "/path/to/project/files/"
m <- clean_metadata(project_dir = base_directory)
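Before pointing clean_metadata() at a real folder, it can help to preview which recordings sit under it. The sketch below uses only base R and builds a small hypothetical folder in a temporary directory so that it is self-contained; the folder and file names are assumptions, and this is not necessarily how clean_metadata() scans a directory.

```r
# Build a tiny hypothetical project folder in a temporary location
base_directory <- file.path(tempdir(), "aru_project_sketch")
dir.create(file.path(base_directory, "site_A"),
           recursive = TRUE, showWarnings = FALSE)
file.create(file.path(base_directory, "site_A",
                      "P01_1_20200502T050000_ARU.wav"))
file.create(file.path(base_directory, "notes.txt"))

# List .wav files in the folder and all of its sub-folders
wav_files <- list.files(base_directory, pattern = "\\.wav$",
                        recursive = TRUE, ignore.case = TRUE)
wav_files
```

Paths are returned relative to the base directory, so the sub-folder names remain visible alongside the file names.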

Add coordinates

Next, we want to add our coordinates to this data.

If your data has GPS logs included, they would have been detected in the above step and you could now use g <- clean_gps(m) to create a list of GPS coordinates.

However, many models of ARUs do not have an internal GPS, and those that do may not accurately record where the ARU was deployed. Therefore, we recommend that you create a site index file to manually record deployment locations, like this one.

example_sites

While you can simply specify a single date, it is recommended that you use both a start date and an end date for the best matching. This is critical if you are moving your ARUs during a season.
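To see why a start and an end date matter when an ARU moves mid-season, here is a minimal base R sketch of interval matching. The data frame, its column names, and the dates are all hypothetical, and the matching rule shown is a simplification rather than the package's actual logic.

```r
# Hypothetical deployments: the same ARU is moved from site_A to site_B
deployments <- data.frame(
  aru_id     = c("ARU_1", "ARU_1"),
  site_id    = c("site_A", "site_B"),
  date_start = as.POSIXct(c("2020-05-01 12:00:00", "2020-05-10 12:00:00"),
                          tz = "UTC"),
  date_end   = as.POSIXct(c("2020-05-10 12:00:00", "2020-05-20 12:00:00"),
                          tz = "UTC"),
  stringsAsFactors = FALSE
)

# A recording made by that ARU after it was moved
recording_time <- as.POSIXct("2020-05-12 05:00:00", tz = "UTC")

# Keep the deployment whose window contains the recording time
in_window <- deployments$aru_id == "ARU_1" &
  recording_time >= deployments$date_start &
  recording_time < deployments$date_end

matched_site <- deployments$site_id[in_window]
matched_site
```

With only a single date per deployment there would be no window to test against, so a recording from a moved ARU could not be assigned to a site this unambiguously.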

Now let's clean up this list so we can add these sites to our metadata.

sites <- clean_site_index(example_sites)

Oops! We can see right away that clean_site_index() expects the data to be in a particular format. Luckily, we can let it know if we've used a different format.

sites <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat")
)

Hmm, that's an interesting message! This means that some of our deployment dates overlap. ARUtools assumes that if you set out an ARU on a specific day, you probably didn't set it out at midnight (i.e. the very start of that day). Since we assume you are likely using ARUs to record in the early morning or late at night, we shift the start and end times of those dates to noon as an estimate of when the ARU was likely deployed.

If your ARU was deployed at midnight, use resolve_overlaps = FALSE. Or, if you know the exact time your ARU was deployed, use a date/time rather than just a date in your site index.
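The effect of shifting a date-only deployment to noon can be illustrated with a few lines of base R. The date and the recording time are hypothetical, and this only sketches the arithmetic of the shift, not the internals of clean_site_index().

```r
# Hypothetical deployment date recorded without a time
date_set_out <- as.Date("2020-05-01")

# Taken literally, a date-only value means midnight at the start of that day
start_midnight <- as.POSIXct(format(date_set_out), tz = "UTC")

# Shifting by 12 hours gives noon as the estimated deployment time
start_noon <- start_midnight + 12 * 60 * 60

# A dawn recording on the set-out date (hypothetical)
dawn_recording <- as.POSIXct("2020-05-01 05:00:00", tz = "UTC")

dawn_recording >= start_midnight  # would count as part of this deployment
dawn_recording >= start_noon      # would not, once the start is shifted to noon
```

Under the noon assumption, a dawn recording on the set-out date is treated as having been made before the ARU arrived at the new site.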

sites

Note that we've lost a couple of non-standard columns: Plots and Subplot.

We can retain these by specifying name_extra.

sites <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat"),
  name_extra = c("Plots", "Subplot")
)
sites

We can even be fancy and rename them for consistency by using named vectors.

sites <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots", "subplot" = "Subplot")
)
sites

Now let's add this site-related information to our metadata.

m <- add_sites(m, sites)
m

Calculate times to sunrise and sunset

Great! We have all the site-related information to describe each recording.

Now to prepare for our selection procedure, the last thing we need to do is calculate the time to sunrise or sunset.

Here we need to be clear about which timezone the ARU used when recording times.

There are two options.

The first option is that all ARUs were set up at home base before deployment. In this case it's possible they were deployed in a location with a different timezone than what they were recording in. This doesn't matter, as long as you specify the programmed timezone here. In this case, use tz = "America/Toronto", or whichever time zone was used. Note that timezones must be one of OlsonNames().

The second option is that each ARU unit was set up to record in the local timezone where it was placed. If this is the case, specify tz = "local" and the calc_sun() function will use coordinates to determine local timezones.

(See the Dealing with Timezones vignette for more details.)
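A quick way to check a timezone name, and to see why the programmed timezone matters, is sketched below in base R. The timestamp and the choice of America/Winnipeg as the deployment timezone are assumptions made only for illustration.

```r
# Timezone the ARU was programmed in (the example name used above)
tz_programmed <- "America/Toronto"

# Confirm the name is one that R recognises
tz_programmed %in% OlsonNames()

# A hypothetical recording time, as written by the ARU
stamp <- as.POSIXct("2020-05-02 05:00:00", tz = tz_programmed)

# The same instant shown on the clock of a hypothetical deployment
# location one timezone to the west
format(stamp, tz = "America/Winnipeg", usetz = TRUE)
```

The two clock times differ by an hour even though they describe the same moment, which is why sunrise and sunset calculations need to know which timezone the recorded times refer to.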

In our example, let's assume that the ARUs were set up in each location they were deployed. So we'll use tz = "local", the default setting.

m <- calc_sun(m)
dplyr::glimpse(m)

Tada! Now we have a complete set of cleaned metadata associated with each recording.

This is a very simple example and much of the pain in large projects comes from complications, so be sure to check out vignette("customizing") and vignette("spatial") to dig into some of these issues.

Next steps

Now that we have a set of cleaned metadata, the next step is to select recordings. To do this using a random sampling approach, check out the subsampling article, vignette("SubSample").




ARUtools documentation built on Oct. 9, 2024, 1:07 a.m.