Quantitative description of song structure based on beat spectra

The problem of song structure

The basic component of Orthoptera songs is the pulse. Therefore, most studies focus on pulse (and inter-pulse) features. However, the songs of some species are organised in a highly hierarchical fashion -- pulses are grouped to form syllables, and syllables form echemes.

Higher-level structures (syllables and echemes) may carry valuable information for defining and classifying songs, and ought to be characterised.

The first part of this document will provide an example of a complex song hierarchy. It will then explain how beat spectra can be used to encapsulate such structures. Finally, it will describe how they can be used to classify songs.

An example of elaborate song structure


The following song, from Chorthippus dorsatus (BioAcoustica annotation 277), is a good illustration of this hierarchical organisation.

# preparing data
library(bioacoustica)
library(orthophonia)
library(data.table)
library(ggplot2)

DATA_DIR <- "~/Desktop/ortho_data/"

all_annotations <- getAllAnnotationData()
query <- all_annotations[id == 277]
my_annotations <- dowloadFilesForAnnotations(query,
                                             dst_dir = DATA_DIR,
                                             verbose = TRUE)
# the example signal
wave_expl <- standardiseWave(my_annotations[id == 277, annotation_path])
# a data.table used only to plot the signal
dt_expl <- data.table(x = wave_expl@left,
                      t = seq_along(wave_expl@left) / wave_expl@samp.rate)

The whole song looks like this:

# plot every 20th sample, to keep the figure light
ggplot(dt_expl[seq(from=1, to=.N, by=20)], aes(t, x)) +
    geom_line() +
    annotate("rect", xmin = 0.4, xmax = 1.25,
             ymin = -Inf, ymax = +Inf, alpha = .2,
             colour="blue", fill="blue") +
    labs(title="Segment of Chorthippus dorsatus song",
         x="t (s)")

At this scale, we can see that the song contains high-level structures, echemes (e.g. the one highlighted by the blue rectangle), which repeat with a period of about 1 s.

If we select this echeme and study its structure:

# zoom on the selected echeme; plot every 5th sample
ggplot(dt_expl[t > 0.4 & t < 1.25][seq(from=1, to=.N, by=5)], aes(t, x)) +
    annotate("rect", xmin = 0.65, xmax = 0.69,
             ymin = -Inf, ymax = +Inf, alpha = .2,
             colour="blue", fill="blue") +
    geom_line() +
    labs(title="Segment of Chorthippus dorsatus song",
         x="t (s)")

We can observe that it contains syllables separated by around 80 ms.

Now, at an even finer resolution:

# zoom on a few pulses within one syllable
ggplot(dt_expl[t > 0.65 & t < 0.69], aes(t * 1e3, x)) +
    geom_line() +
    annotate("rect", xmin = 660.5, xmax = 663.5,
             ymin = -Inf, ymax = +Inf, alpha = .2,
             colour="blue", fill="blue") +
    labs(title="Segment of Chorthippus dorsatus song",
         x="t (ms)")

We see distinct pulses with a period of around 4 ms.

Altogether, this song has three layers of periodicity. To describe such a structure, one needs a tool capable of providing sufficient frequency (i.e. period) resolution over a large range of scales; in this example, the rhythmic periods range from a few milliseconds to about a second, i.e. they span close to three orders of magnitude (roughly eight octaves).
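For instance, with the rough estimates obtained above (pulse period ~4 ms, syllable period ~80 ms, echeme period ~1 s), the span can be computed directly in base R:

# approximate periods of the three structural levels, estimated by eye (seconds)
periods <- c(pulse = 4e-3, syllable = 80e-3, echeme = 1)

log10(max(periods) / min(periods)) # ~2.4 orders of magnitude
log2(max(periods) / min(periods))  # ~8 octaves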

The beat spectrum

My work has so far led me to devise an algorithm to extract information about the rhythm of songs. In order to remain relatively agnostic to the expected range of periods, I used the continuous wavelet transform, as it naturally provides resolution of periods over a wide dynamic range (on a logarithmic scale). This tool can generate a "beat spectrum", which is a period (or frequency) representation of the rhythmicity of the sound envelope.
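To make the idea more concrete, here is a rough, self-contained sketch of one way such a beat spectrum could be computed: take the amplitude envelope of the signal, probe it with Morlet-like wavelets at log-spaced candidate periods, and average the wavelet power over time for each period. This is an illustration only, not the implementation of orthophonia's beatSpectrum(); the function name naiveBeatSpectrum, the envelope-extraction choices and all parameter values are hypothetical.

# Illustrative sketch only -- NOT the orthophonia implementation
naiveBeatSpectrum <- function(wave, n_periods = 64,
                              min_period = 2e-3, max_period = 2) {
  # 1. crude amplitude envelope: rectify, smooth (~1 ms), downsample to ~2 kHz
  fs <- wave@samp.rate
  env <- abs(wave@left)
  k <- max(1, round(fs * 1e-3))
  env <- as.numeric(stats::filter(env, rep(1 / k, k), sides = 2))
  env <- env[!is.na(env)]
  step <- max(1, round(fs / 2000))
  env <- env[seq(1, length(env), by = step)]
  fs_env <- fs / step
  env <- env - mean(env)

  # 2. log-spaced candidate periods (in seconds)
  periods <- exp(seq(log(min_period), log(max_period), length.out = n_periods))

  # 3. for each period, correlate the envelope with a Morlet-like wavelet
  #    and average the resulting power over time
  power <- sapply(periods, function(p) {
    n_cyc <- 6                              # wavelet support, in cycles
    sigma <- n_cyc * p / (2 * pi)           # Gaussian width of the wavelet
    t <- seq(-3 * sigma, 3 * sigma, by = 1 / fs_env)
    w <- exp(2i * pi * t / p) * exp(-(t / sigma) ^ 2 / 2)
    if (length(w) >= length(env)) return(NA_real_)
    re <- stats::convolve(env, Re(w), type = "filter")
    im <- stats::convolve(env, Im(w), type = "filter")
    mean(re ^ 2 + im ^ 2) / sum(Mod(w) ^ 2) # rough normalisation
  })

  data.table(period = periods, power = power)
}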

If we apply beatSpectrum() to the previous example, we get:

# automatically band-pass filter the signal, then compute its beat spectrum
wave_expl <- autoBandPassFilter(wave_expl)
dt <- beatSpectrum(wave_expl)

# coordinates of the three peaks (in ms), used only to draw annotation arrows
arrows <- data.table(xend=c(5,80,1400), yend=c(0.35,0.15,0.4))
arrows$x <- arrows$xend * 1.2
arrows$y <- arrows$yend + 0.1

ggplot(dt, aes(period*1000, power)) +
  geom_bar(stat="identity", colour="blue", fill="blue") +
  scale_x_log10(breaks=c(10^(0:3), 3*10^(0:3))) +
  labs(title="Beat spectrum of\nChorthippus dorsatus song",
       x="Period (ms)") +
  geom_segment(data=arrows, aes(x=x, xend=xend, y=y, yend=yend),
               arrow=arrow(length=unit(0.2,"cm")), colour="red")

The red arrows indicate the three peaks corresponding to the three levels of rhythmicity. From left to right, they lie at approximately 5 ms, 80 ms and 1.4 s, which matches the results we obtained by graphical inspection.
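The same peaks can also be picked out programmatically rather than by eye, for instance by looking for local maxima in the spectrum. This is a simple sketch; the 0.1 power threshold is an arbitrary illustrative value and may need adjusting:

# local maxima of the beat spectrum above an arbitrary power threshold
peaks <- dt[power > shift(power, 1, fill = 0, type = "lag") &
            power > shift(power, 1, fill = 0, type = "lead") &
            power > 0.1]
peaks[, .(period_ms = round(period * 1000, 1), power)]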

Classification of beat spectra

The idea is to generalise the use of this tool to classify songs. For instance, we could pull several songs from two species, Chorthippus dorsatus and Chorthippus brunneus, and attempt to cluster them based on their rhythmicity.

DATA_DIR <- "~/Desktop/ortho_data/"
taxa <- c("Chorthippus Glyptobothrus brunneus", "Chorthippus Chorthippus dorsatus")
all_annotations <- getAllAnnotationData()
query <- all_annotations[taxon %in% taxa & author == "qgeissmann"]
# keep annotations shorter than 30 s, for speed
query <- query[end - start < 30]
my_annotations <- dowloadFilesForAnnotations(query,
                                             dst_dir = DATA_DIR,
                                             verbose = TRUE,
                                             force = FALSE)

# This is what we would like to do for each annotation:
# read and standardise the wave, then compute its beat spectrum
pipelineFunction <- function(file_name){
  wave <- standardiseWave(file_name)
  # optionally, band-pass filter first:
  # wave <- autoBandPassFilter(wave)
  beatSpectrum(wave)
}


# we use data.table's `by` to apply the pipeline to each annotation; this can take some time:
dt <- my_annotations[, pipelineFunction(annotation_path), by=id]

print(dt)
# join the beat spectra back onto the annotation metadata (taxon, etc.)
dt <- my_annotations[dt]
ggplot(dt, aes(period, power, group=id)) +
   geom_line(alpha=.3) +
   scale_x_log10(name="period (s)", limits=c(1e-3, 5e1)) +
   facet_wrap(~ taxon, nrow=2)

Now, let's perform a hierarchical clustering. Since we essentially want to align (period) series, we can use time-series clustering methods such as Dynamic Time Warping (DTW) or shape-based distance; the dtwclust package implements both.

library(dtwclust)
library(ggdendro)

# hierarchical clustering of the beat spectra, using shape-based distance (SBD)
ctrl <- new("dtwclustControl", window.size = 20L, trace = TRUE)
datalist <- split(dt$power, dt$id)
hc.sbd <- dtwclust(datalist, type = "hierarchical",
                   k = 19:21, distance = "sbd",
                   method = "all",
                   control = ctrl, seed=1234)
# cluster validity indices (CVIs) for each candidate clustering
cvis <- sapply(hc.sbd, cvi, b = names(datalist))

# pick the clustering with the lowest VI index
result <- hc.sbd[[which.min(cvis["VI", ])]]
# label the dendrogram leaves with annotation ids
result$labels <- unique(dt)$id

ddata <- dendro_data(result, type = "rectangle")

# attach the taxon id of each annotation to the dendrogram labels
labs <- as.data.table(ddata$labels)
labs$id <- labs$label
setkey(labs, id)
tmp_dt <- unique(dt[, .(id, taxon_id)])
tmp_dt[, id := as.character(id)]
setkey(tmp_dt, id)

labs <- tmp_dt[labs]


ggplot(segment(ddata)) + 
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) + 
  geom_text(data = labs, 
            aes(x = x, y = y, label = label,
                colour=taxon_id), 
                vjust = 0.5, hjust=0) + 
  coord_flip() + scale_y_reverse(expand = c(0.2, 0)) +
  theme_dendro()
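We can also quantify how well the tree separates the two taxa by cutting it into two groups and cross-tabulating the result against the taxon ids. This is a sketch: it assumes that result can be handled like a standard hclust object (so that cutree() applies) and that the leaf names match the annotation ids keyed in tmp_dt.

# cut the dendrogram into two groups and compare with the known taxa
# (assumes `result` behaves like a standard hclust object)
membership <- cutree(result, k = 2)
table(cluster = membership,
      taxon = tmp_dt[as.character(names(membership)), taxon_id])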

Here, leaves are labelled with annotation ids and coloured by taxon id. Taxon ids 77 and 207 correspond to Chorthippus (Glyptobothrus) brunneus and Chorthippus (Chorthippus) dorsatus, respectively.

This unsupervised approach performs quite well overall, but we can spot several outliers. Namely, annotations 274 and 244 seem misclassified as dorsatus, and annotation 278 as brunneus.

Listening to these recordings individually provides some explanation.

In conclusion, beat spectra can be used to characterise the structure of songs. The misclassifications observed in this example result from clear outliers.

There are several ways in which the classification could be improved.


