This vignette details the use of template-based detection in ohun. Template-based detection is better suited for highly stereotyped sound events. Because it is less affected by signal-to-noise ratio, it is also more robust to higher levels of background noise.
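The procedure for template-based detection is divided into three steps:
1. Choosing a template (get_templates())
2. Correlating the template along the sound files (template_correlator())
3. Detecting sound events by applying a correlation threshold (template_detector())

First, we need to install the package. It can be installed from CRAN as follows: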
```r
# install from CRAN
install.packages("ohun")

# load package
library(ohun)
```
To install the latest development version from GitHub you will need the R package remotes:
```r
# install package
remotes::install_github("maRce10/ohun")

# load packages
library(ohun)
library(tuneR)
library(warbleR)
```
The package comes with an example reference table containing annotations of long-billed hermit hummingbird songs from two sound files (also supplied as example data: 'lbh1' and 'lbh2'), which will be used in this vignette. The example data can be loaded and explored as follows:
```r
# load example data
data("lbh1", "lbh2", "lbh_reference")

# save sound files
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))

# select a subset of the data
lbh1_reference <- lbh_reference[lbh_reference$sound.files == "lbh1.wav", ]

# print data
lbh1_reference
```
We can also plot the annotations on top of the spectrograms to further explore the data (this function only plots one wave object at a time, so it is not very useful for long files):
```r
# print spectrogram
label_spectro(
  wave = lbh1,
  reference = lbh1_reference,
  hop.size = 10,
  ovlp = 50,
  flim = c(1, 10)
)
```
The function get_templates() can help you find a template close to the average acoustic structure of the sound events in a reference table. This is done by finding the sound events closest to the centroid of the acoustic space. When the acoustic space is not supplied ('acoustic.space' argument), the function estimates it by measuring several acoustic parameters with the function spectro_analysis() (from warbleR) and summarizing them with Principal Component Analysis (after z-transforming the parameters). If only one template is required, the function returns the sound event closest to the acoustic space centroid. The rationale is that a sound event close to the average sound event structure is more likely to share structural features with most sounds across the acoustic space than a sound event in the periphery of the space. These 'mean structure' templates can be obtained as follows:
```r
# get mean structure template
template <- get_templates(reference = lbh1_reference, path = tempdir())
```
The graph above shows the overall acoustic space, in which the sound closest to the space centroid is highlighted. The highlighted sound is selected as the template and can be used to detect similar sound events. The function get_templates() can also select several templates. This can be helpful when working with sounds that are only moderately stereotyped. It is done by dividing the acoustic space into sub-spaces defined as equal-size slices of a circle centered at the centroid of the acoustic space:
```r
# get 3 templates
get_templates(reference = lbh1_reference, n.sub.spaces = 3, path = tempdir())
```
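The centroid-based selection described above can be illustrated with a minimal base R sketch. It is not the code used inside get_templates(): the acoustic parameters are simulated, the column names ('duration', 'mean_freq', 'bandwidth') are hypothetical, and only the first two principal components are used as the acoustic space. The last step shows one way of assigning events to equal-size circle slices; how a template is then picked within each slice is not covered here.

```r
# Sketch only: simulated data, not ohun's internal code
set.seed(42)

# hypothetical acoustic parameters for 20 sound events (one row per event)
params <- data.frame(
  duration  = rnorm(20, mean = 0.15, sd = 0.02),
  mean_freq = rnorm(20, mean = 5, sd = 0.5),
  bandwidth = rnorm(20, mean = 3, sd = 0.4)
)

# z-transform the parameters and summarize them with PCA
pca <- prcomp(params, center = TRUE, scale. = TRUE)
space <- pca$x[, 1:2]

# centroid of the acoustic space and distance of each event to it
centroid <- colMeans(space)
dist_to_centroid <- sqrt(rowSums(sweep(space, 2, centroid)^2))

# the event closest to the centroid is the 'mean structure' template
template_row <- which.min(dist_to_centroid)
template_row

# assign each event to one of 3 equal-size slices of a circle
# centered at the centroid (angle measured from the first component)
angle <- atan2(space[, 2] - centroid[2], space[, 1] - centroid[1])
slice <- cut(angle, breaks = seq(-pi, pi, length.out = 4),
             include.lowest = TRUE, labels = FALSE)
table(slice)
```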
We will use the single template object ('template') to run a detection on the example 'lbh1' data:
```r
# get correlations
correlations <- template_correlator(
  templates = template,
  files = "lbh1.wav",
  path = tempdir()
)
```
The output is an object of class 'template_correlations', with its own printing method:
```r
# print
correlations
```
This object can then be used to detect sound events using template_detector():
```r
# run detection
detection <- template_detector(template.correlations = correlations, threshold = 0.7)

detection
```
The output can be explored by plotting the spectrogram along with the detection and correlation scores:
```r
# plot spectrogram
label_spectro(
  wave = lbh1,
  detection = detection,
  template.correlation = correlations[[1]],
  flim = c(0, 10),
  threshold = 0.7,
  hop.size = 10,
  ovlp = 50
)
```
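To make the correlate-then-threshold logic more concrete, here is a toy one-dimensional sketch in base R. It is an illustration under simplifying assumptions, not the procedure implemented in template_correlator() or template_detector(): the 'template' and 'signal' are simulated amplitude vectors rather than spectrograms, and a detection is simply taken as the highest score within each run of scores above the threshold.

```r
# Sketch only: a 1-D toy version, not ohun's implementation
set.seed(1)

# hypothetical template: a short amplitude pattern
template <- sin(seq(0, 2 * pi, length.out = 50))

# hypothetical signal: noise with the template embedded at two positions
signal <- rnorm(1000, sd = 0.3)
starts <- c(200, 600)
for (s in starts) {
  idx <- s:(s + length(template) - 1)
  signal[idx] <- signal[idx] + template
}

# slide the template along the signal, computing a Pearson correlation
# between the template and each window of the same length
n_win <- length(signal) - length(template) + 1
scores <- vapply(seq_len(n_win), function(i) {
  cor(template, signal[i:(i + length(template) - 1)])
}, numeric(1))

# apply a correlation threshold and locate runs of scores above it
threshold <- 0.7
runs <- rle(scores > threshold)
run_end <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

# keep the position of the highest score within each run
peaks <- mapply(function(a, b) a + which.max(scores[a:b]) - 1,
                run_start[runs$values], run_end[runs$values])
peaks
```

Raising the threshold in this sketch shortens or removes the runs, which is why the choice of threshold trades off missed events against false detections.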
The performance can be evaluated using diagnose_detection():
```r
# diagnose
diagnose_detection(reference = lbh1_reference, detection = detection)
```
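As a rough guide to reading recall, precision and F-score, the following base R sketch computes them from time overlaps between two hypothetical tables. It is not the code behind diagnose_detection(): the 'start' and 'end' values are made up, and it assumes each detection overlaps at most one reference event, so it ignores cases such as duplicated detections.

```r
# Sketch only: hypothetical annotations, not ohun's internal code
reference <- data.frame(start = c(1.0, 2.0, 3.0, 4.0),
                        end   = c(1.2, 2.2, 3.2, 4.2))
detection <- data.frame(start = c(1.05, 2.10, 5.00),
                        end   = c(1.25, 2.30, 5.20))

# TRUE if two time intervals overlap
overlaps <- function(s1, e1, s2, e2) s1 < e2 & s2 < e1

# overlap matrix: rows are detections, columns are reference events
ov <- outer(seq_len(nrow(detection)), seq_len(nrow(reference)),
            function(i, j) overlaps(detection$start[i], detection$end[i],
                                    reference$start[j], reference$end[j]))

true_positives  <- sum(rowSums(ov) > 0)   # detections matching a reference event
false_positives <- sum(rowSums(ov) == 0)  # detections matching nothing
false_negatives <- sum(colSums(ov) == 0)  # reference events that were missed

recall    <- true_positives / (true_positives + false_negatives)
precision <- true_positives / (true_positives + false_positives)
f_score   <- 2 * precision * recall / (precision + recall)

c(recall = recall, precision = precision, f.score = f_score)
```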
The function optimize_template_detector() allows evaluating the detection performance under different correlation thresholds:
```r
# run optimization
optimization <- optimize_template_detector(
  template.correlations = correlations,
  reference = lbh1_reference,
  threshold = seq(0.1, 0.5, 0.1)
)

# print output
optimization
```
Additional threshold values can be evaluated without having to run it all over again. We just need to supply the output from the previous run with the argument previous.output (the same trick can be used when optimizing an energy-based detection):
```r
# run optimization
optimize_template_detector(
  template.correlations = correlations,
  reference = lbh1_reference,
  threshold = c(0.6, 0.7),
  previous.output = optimization
)
```
In this case, two threshold values (0.5 and 0.6) achieve an optimal detection.
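The idea of comparing thresholds can be sketched as a simple sweep in base R. This is only an illustration with made-up numbers, not what optimize_template_detector() does internally: the 'peaks' table, its 'score' and 'target' columns and the helper evaluate_threshold() are all hypothetical.

```r
# Sketch only: hypothetical correlation peaks, not ohun's internal code
# each row is a candidate peak: its correlation score and whether it
# coincides with a reference sound event
peaks <- data.frame(
  score  = c(0.95, 0.91, 0.86, 0.82, 0.57, 0.44, 0.33, 0.21),
  target = c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)
)
n_reference <- sum(peaks$target)

# hypothetical helper: performance when keeping peaks at or above a threshold
evaluate_threshold <- function(threshold) {
  kept <- peaks[peaks$score >= threshold, ]
  tp <- sum(kept$target)
  recall <- tp / n_reference
  precision <- if (nrow(kept) > 0) tp / nrow(kept) else NA
  f_score <- 2 * precision * recall / (precision + recall)
  data.frame(threshold, recall, precision, f.score = f_score)
}

# evaluate a range of thresholds and show the one with the highest F-score
thresholds <- seq(0.1, 0.7, 0.1)
optimization <- do.call(rbind, lapply(thresholds, evaluate_threshold))
best <- optimization[which.max(optimization$f.score), ]
best
```

Low thresholds keep every target but also let in false positives, while high thresholds start dropping true events, so the F-score usually peaks somewhere in between.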
Several templates can be used within the same call. Here we correlate two templates, both taken from the 'lbh1' reference annotations (rows 1 and 10), along the 'lbh1' sound file:
```r
# get correlations
correlations <- template_correlator(
  templates = lbh1_reference[c(1, 10), ],
  files = "lbh1.wav",
  path = tempdir()
)

# run detection
detection <- template_detector(template.correlations = correlations, threshold = 0.6)
```
Note that in these cases the same sound event can be detected several times (duplicates), once by each template. We can check whether that is the case by diagnosing the detection:
```r
# diagnose
diagnose_detection(reference = lbh1_reference, detection = detection)
```
In this case we got perfect recall but low precision. That is because the same reference events were picked up by both templates. We can diagnose the detection by template to see this more clearly:
```r
# diagnose by template
diagnose_detection(reference = lbh1_reference, detection = detection, by = "template")
```
We see that, independently, each template achieved a good detection. We can create a consensus detection by keeping only the detections with the highest correlation across templates. To do this we first need to label each row in the detection using label_detection() and then remove duplicates using consensus_detection():
```r
# labeling detection
labeled <- label_detection(reference = lbh_reference, detection = detection, by = "template")
```
Now we can filter out duplicates and diagnose the detection again, telling the function to select a single row per duplicate using the correlation score as the criterion (by = "scores"; this column is part of the template_detector() output):
```r
# filter
consensus <- consensus_detection(detection = labeled, by = "scores")

# diagnose
diagnose_detection(reference = lbh1_reference, detection = consensus)
```
We successfully got rid of duplicates and detected every single target sound event.
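The principle of keeping the best-scoring detection among duplicates can be sketched in base R as follows. This is a simplified illustration, not the logic of label_detection() or consensus_detection(): the detections are made up, and duplicates are defined here simply as rows whose time ranges overlap.

```r
# Sketch only: hypothetical detections, not ohun's internal code
detection <- data.frame(
  template = c("A", "A", "B", "B"),
  start    = c(1.00, 2.00, 1.02, 2.01),
  end      = c(1.20, 2.20, 1.22, 2.21),
  scores   = c(0.90, 0.75, 0.80, 0.85)
)

# sort by start time and group rows whose time ranges overlap:
# a new group starts when a row begins after all previous rows have ended
detection <- detection[order(detection$start), ]
new_group <- c(TRUE, detection$start[-1] >= cummax(detection$end)[-nrow(detection)])
detection$group <- cumsum(new_group)

# within each group keep only the row with the highest score
keep <- unlist(lapply(split(seq_len(nrow(detection)), detection$group),
                      function(rows) rows[which.max(detection$scores[rows])]))
consensus <- detection[keep, ]
consensus
```

In this toy example each sound event ends up represented by a single row, taken from whichever template correlated best with it.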
Session information
sessionInfo()