batch_process: Main function to run all steps on a folder of sound files

View source: R/batch_process.R

batch_process    R Documentation

Description

Main function of the package 'NocMigR' that allows analysing a suite of long-term recordings from scratch by executing six distinct steps (see the 'Details' section). Under the hood, this function calls algorithms from the R packages tuneR, warbleR, seewave and bioacoustics. Important note: Especially for large sets of recordings (e.g., an 'AudioMoth' deployed for a week, ~ 60 GB of data), R can easily run into memory issues. This could be tackled by coding the functions more efficiently. For now, the best way to handle the issue is to resume the script after it broke (see the output in the console and adjust the steps argument accordingly).
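As a minimal sketch of resuming after a failure, assume renaming and splitting (steps 1 and 2) already completed and the run broke during event detection. The folder path is hypothetical, and the call is guarded so that nothing happens if the package or the folder is missing.

```r
# Hypothetical project folder; replace with the folder holding your recordings.
rec_dir <- "path/to/recordings"

# Assumption: steps 1 and 2 finished before R ran out of memory,
# so only the remaining steps are requested.
resume_steps <- 3:6

if (requireNamespace("NocMigR", quietly = TRUE) && dir.exists(rec_dir)) {
  events <- NocMigR::batch_process(
    path = rec_dir,
    steps = resume_steps
  )
}
```

Which steps to list depends on the last step reported as completed in the console output.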

Usage

batch_process(
  path = NULL,
  format = c("WAV", "wav", "mp3", "MP3"),
  steps = 1:6,
  rename = FALSE,
  segment = NULL,
  mono = TRUE,
  downsample = NULL,
  rescale = NULL,
  SNR = 8,
  buffer = 1,
  max.events = 999,
  target = td_presets("Bubo bubo"),
  recorder = c("AudioMoth", "Olympus LS-3", "Sony PCM-D100"),
  time = c("mtime", "ctime"),
  write_text = FALSE,
  .onsplit = TRUE
)

Arguments

path

Path to a set of recordings (all of the same format and covering a continuous time span). Important note: Files are expected to be named using a YYYYMMDD_HHMMSS string; otherwise set rename = TRUE to allow renaming. Files including the extensions "_extracted.WAV" or "merged_events.WAV" are reserved for writing output files and are ignored as inputs.
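The naming convention can be checked before starting a run. The following base R sketch is not part of 'NocMigR'; the file names and the regular expressions are illustrative assumptions derived from the description above.

```r
# Hypothetical folder listing, e.g. as returned by list.files(path)
files <- c("20210301_183000.WAV", "recording01.WAV",
           "20210301_183000_extracted.WAV", "merged_events.WAV")

# Reserved output files are not treated as inputs
reserved <- grepl("(_extracted|merged_events)\\.WAV$", files)
inputs <- files[!reserved]

# A valid input name is exactly YYYYMMDD_HHMMSS plus the file extension
stem <- tools::file_path_sans_ext(inputs)
ok <- grepl("^[0-9]{8}_[0-9]{6}$", stem)

# Start time of each correctly named recording (time zone chosen for illustration)
start <- as.POSIXct(stem[ok], format = "%Y%m%d_%H%M%S", tz = "UTC")

# Files that would need rename = TRUE
needs_rename <- inputs[!ok]
```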

format

Format of sound files (default and suggested is to use WAV).

steps

Numeric or character vector; by default, steps 1:6 are executed (1 = rename_recording, 2 = split_wave, 3 = find_events, 4 = join_audacity, 5 = extract_events, 6 = merge_events).

rename

Logical, allows to rename recordings (default FALSE).

segment

NULL or a numeric value giving the segment size for split_wave in seconds (default NULL).
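To illustrate how a segment size in seconds relates to time-stamped file names, here is a base R sketch. It does not call split_wave; the recording duration, the time zone and the ".WAV" extension are assumptions made for the example.

```r
# Start time taken from a hypothetical file name "20210301_183000.WAV"
start <- as.POSIXct("20210301_183000", format = "%Y%m%d_%H%M%S", tz = "UTC")

duration <- 1500  # assumed total length of the recording in seconds
segment <- 600    # segment size in seconds, as passed to the segment argument

# Offsets (in seconds) at which each chunk begins
offsets <- seq(0, duration - 1, by = segment)

# Each chunk is named after its own onset time
chunk_names <- paste0(format(start + offsets, "%Y%m%d_%H%M%S"), ".WAV")
```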

mono

Logical. By default, split_wave coerces stereo files to mono prior to event detection (default TRUE). If kept as a stereo file, the left channel will be used in find_events.

downsample

NULL or a re-sampling factor used in split_wave (default NULL).

rescale

Optional. Allows rescaling the wav to a new bit depth (e.g., "8", "16", "24").

SNR

Numeric value (dB) specifying signal to noise ratio for find_events (default 8).

buffer

Buffer in seconds added before and after the event (default 1). Also controls the detection of overlapping events.
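One way to picture the effect of the buffer on overlapping events is the base R sketch below. It is an assumption about the general idea (pad each event, then merge events whose padded intervals overlap), not the package's actual implementation; the event times are made up.

```r
# Hypothetical detections: onset and offset in seconds
events <- data.frame(from = c(10.0, 11.5, 20.0),
                     to   = c(10.8, 12.0, 20.5))
buffer <- 1

# Pad each event on both sides, never going below time zero
padded <- data.frame(from = pmax(events$from - buffer, 0),
                     to   = events$to + buffer)
padded <- padded[order(padded$from), ]

# A new group starts when an interval begins after all earlier ones have ended
running_end <- cummax(padded$to)
new_group <- c(TRUE, padded$from[-1] > running_end[-nrow(padded)])
group <- cumsum(new_group)

# Collapse each group into a single interval
merged <- data.frame(
  from = as.numeric(tapply(padded$from, group, min)),
  to   = as.numeric(tapply(padded$to, group, max))
)
```

With these values, the first two detections fall into one padded interval, while the third stays separate.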

max.events

Numeric, giving the maximum number of events before a file is skipped (default 999). Usually very high detection rates indicate an issue with noise (e.g., wind or rain).

target

Data frame specifying parameter values used by threshold_detection to detect events. Values are passed on as they are. Default is a call to td_presets.

recorder

Currently three templates to ensure correct handling of times. Only relevant if rename = TRUE.

time

Controls whether ctime or mtime is used to compute date_time objects. Only relevant if rename = TRUE.

write_text

Logical. If TRUE, exports a text file with information about file renaming.

.onsplit

Logical. By default, searches for a sub folder "split" and bases analyses on the segmented files if found. Also switched to TRUE if segment is not NULL.

Details

By default, runs all steps (currently six, see steps) of the analysis workflow consecutively. Recordings of a project (i.e., usually continuous signal or time-expanded if otherwise) need to be saved to a single directory, specified as the path argument:

1.) If steps = 1 or 'rename_audio': Attempts to rename audio files to the YYYYMMDD_HHMMSS format, where the date and time at the onset of the recording are coded in the file name. This step ensures that all downstream algorithms can compute the correct dates and times of events. If files are already formatted correctly (e.g., captured by 'AudioMoth'), this step can be skipped (default behaviour) by setting rename = FALSE and/or excluding '1' from steps. Internally, calls the function rename_recording.

2.) If steps = 2 or 'split_wave': Attempts to split large audio files into chunks, with the chunk size controlled by segment, to reduce the file size prior to calling event detection algorithms. Within the parent folder (path), a sub folder "split" is created to hold the files. Each file is named with the correct 'YYYYMMDD_HHMMSS' string. If files are already formatted correctly (e.g., captured by 'AudioMoth'), this step can be skipped by removing '2' from the vector steps. Internally, calls the function split_wave.

3.) If steps = 3 or 'find_events': Queries bioacoustics::threshold_detection to detect events based on signal-to-noise ratios (SNR). When events are found, a 'txt' file based on the file name of the recording is created with labels for reviewing in 'Audacity'. Internally, calls the function find_events.

4.) If steps = 4 or 'join_audacity': If event detection is based on segmented files (i.e., the sub folder 'split' exists), loops through the text files containing Audacity labels and merges them with respect to the original file (as matched by date and time overlap).

5.) If steps = 5 or 'extract_events': Extracts events from the full-length recordings and writes them to a new wave file with extension 'extracted.wav'. Additionally, creates a file with Audacity labels (extension 'extracted.txt').

6.) If steps = 6 or 'merge_events': Concatenates files holding extracted events along with their labels to merge the output of a project.
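The label files mentioned in steps 3 to 5 follow the plain-text format of Audacity label tracks: one event per line, with start time, end time and label text separated by tabs and no header row. The base R sketch below writes such a file by hand; the event times, the label texts and the file name are made up, and the exact labels written by find_events may differ.

```r
# Hypothetical events: start and end in seconds, plus a free-text label
labels <- data.frame(
  from  = c(12.5, 340.0),
  to    = c(14.0, 342.2),
  label = c("event 1", "event 2")
)

# The file name mirrors the recording name, as described for step 3
label_file <- file.path(tempdir(), "20210301_183000.txt")

# Tab-separated, no header, no quotes: the layout Audacity imports as a label track
write.table(labels, label_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the labels back for inspection
imported <- read.delim(label_file, header = FALSE,
                       col.names = c("from", "to", "label"))
```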

Value

Data frame with extracted events if extract_events was queried.
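A minimal sketch of a complete run that captures the returned data frame is shown below. The folder path and the chosen argument values are hypothetical; only arguments listed under Usage are used, and the call is guarded so that nothing happens if the package or the folder is missing.

```r
# Hypothetical project folder and settings; adjust to your own recordings.
rec_dir <- "path/to/recordings"

args <- list(
  path = rec_dir,
  format = "WAV",
  steps = 1:6,                 # run the whole workflow
  rename = TRUE,               # file names are not yet YYYYMMDD_HHMMSS
  recorder = "Olympus LS-3",   # one of the templates listed under Usage
  time = "mtime",
  segment = 600,               # split into 10-minute chunks
  SNR = 8,
  buffer = 1
)

if (requireNamespace("NocMigR", quietly = TRUE) && dir.exists(rec_dir)) {
  # target is left at its default, a call to td_presets
  events <- do.call(NocMigR::batch_process, args)
  # If extract_events was among the steps, events is a data frame
  if (is.data.frame(events)) print(head(events))
}
```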


mottensmann/NocMigR documentation built on Oct. 3, 2023, 3:36 a.m.