compareSounds: Compare sounds (experimental)

View source: R/matchPars.R

Description

Computes similarity between two sounds based on correlating mel-transformed spectra (auditory spectra). Called by matchPars.

Usage

compareSounds(target, targetSpec = NULL, cand, samplingRate = NULL,
  method = c("cor", "cosine", "pixel", "dtw")[1:4], windowLength = 40,
  overlap = 50, step = NULL, padWith = NA, penalizeLengthDif = TRUE,
  throwaway = -120, maxFreq = NULL, summary = TRUE)

Arguments

target

the sound we want to reproduce using soundgen: path to a .wav file or numeric vector

targetSpec

if already calculated, the target auditory spectrum can be provided to speed things up

cand

the sound to be compared to target

samplingRate

sampling rate of target (only needed if target is a numeric vector, rather than a .wav file)

method

method of comparing the mel-transformed spectra of two sounds: "cor" = average Pearson's correlation of mel-transformed spectra of individual FFT frames; "cosine" = same as "cor", but with cosine similarity instead of Pearson's correlation; "pixel" = absolute difference between each point in the two spectra; "dtw" = dynamic time warping with the dtw package
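
The first three measures can be illustrated on toy matrices. The following is a minimal base-R sketch, not the code used inside compareSounds: the matrices are random numbers rather than real mel spectra, and the names specA, specB, cosineSim, frameCor, frameCos and pixelDif are made up for this illustration. How the raw "pixel" difference is turned into a similarity score is not shown here, and "dtw" is left out because it needs the dtw package.

```r
# Two toy "spectra": rows = frequency bins, columns = time frames
# (hypothetical data, not real mel-transformed spectra)
set.seed(1)
specA = matrix(runif(20), nrow = 5, ncol = 4)
specB = specA + matrix(rnorm(20, sd = 0.1), nrow = 5, ncol = 4)

# Cosine similarity of two numeric vectors (hypothetical helper)
cosineSim = function(x, y) sum(x * y) / sqrt(sum(x^2) * sum(y^2))

# "cor"-like: Pearson's correlation per frame, then averaged over frames
frameCor = sapply(seq_len(ncol(specA)),
                  function(i) cor(specA[, i], specB[, i]))

# "cosine"-like: cosine similarity per frame, then averaged over frames
frameCos = sapply(seq_len(ncol(specA)),
                  function(i) cosineSim(specA[, i], specB[, i]))

# "pixel"-like: mean absolute point-by-point difference (smaller = closer)
pixelDif = mean(abs(specA - specB))

c(cor = mean(frameCor), cosine = mean(frameCos), pixelDif = pixelDif)
```

Pearson's correlation centers each frame before comparing, while cosine similarity does not, so the two can diverge when frames differ mainly by a constant offset.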

windowLength

length of FFT window, ms

overlap

overlap between successive FFT frames, %

step

you can override overlap by specifying FFT step, ms
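
As a rough illustration of how windowLength, overlap and step relate, here is a small sketch. It assumes the usual STFT convention that the step equals the window length times the non-overlapping fraction; the helper stepFromOverlap and the frame count (which assumes no padding at the edges) are hypothetical and not taken from the package.

```r
# Hypothetical helper: FFT step (ms) implied by window length (ms) and overlap (%)
stepFromOverlap = function(windowLength, overlap) {
  windowLength * (1 - overlap / 100)
}

windowLength = 40                               # ms, the default
overlap = 50                                    # %, the default
step = stepFromOverlap(windowLength, overlap)   # 20 ms

# Number of complete frames in a 500 ms sound, assuming no edge padding
durMs = 500
nFrames = floor((durMs - windowLength) / step) + 1
```

Under this assumption, passing step = 20 would be equivalent to the default overlap = 50 with a 40 ms window.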

padWith

the compared spectra are padded with either silence (padWith = 0) or NAs (padWith = NA) so that they have the same number of columns. When the sounds differ in duration, padding with zeroes rather than NAs improves the fit to target as measured by method = 'pixel' and 'dtw', but it has no effect on 'cor' and 'cosine'.
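
The padding itself can be pictured with a short sketch. This is an illustration only: the helper padCols is hypothetical, and appending the extra columns on the right is an assumption, not necessarily what compareSounds does.

```r
# Hypothetical helper: append columns filled with padWith until m has nCol columns
padCols = function(m, nCol, padWith = NA) {
  nExtra = nCol - ncol(m)
  if (nExtra <= 0) return(m)
  cbind(m, matrix(padWith, nrow = nrow(m), ncol = nExtra))
}

shortSpec = matrix(1, nrow = 3, ncol = 2)   # toy spectrum with 2 frames
longSpec = matrix(1, nrow = 3, ncol = 4)    # toy spectrum with 4 frames

paddedNA = padCols(shortSpec, ncol(longSpec), padWith = NA)
paddedZero = padCols(shortSpec, ncol(longSpec), padWith = 0)

dim(paddedNA)          # same shape as longSpec
sum(is.na(paddedNA))   # padded cells are missing values
sum(paddedZero == 0)   # padded cells are treated as silence
```

With NA padding, the extra frames can be skipped by any computation that drops missing values; with zero padding, they enter the comparison as silent frames.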

penalizeLengthDif

if TRUE, sounds of different length are considered to be less similar; if FALSE, only the overlapping parts of two sounds are compared

throwaway

parts of the spectra quieter than throwaway dB are not compared

maxFreq

parts of the spectra above maxFreq Hz are not compared
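
The effect of throwaway and maxFreq can be sketched as a simple mask. The code below is an assumption-laden illustration: it treats the values as amplitudes, measures dB relative to the loudest point, and marks excluded cells as NA. None of these details is confirmed by this page, and the objects spec, specDb and masked are hypothetical.

```r
# Hypothetical spectrum: rows = frequency bins (Hz in rownames), columns = frames
spec = matrix(c(1,   0.5,  1e-7, 0.1,
                0.2, 1e-8, 0.3,  0.05),
              nrow = 4,
              dimnames = list(c(500, 1000, 2000, 4000), NULL))

throwaway = -120   # dB, assumed here to be relative to the loudest point
maxFreq = 3000     # Hz

# Amplitude to dB relative to the maximum (assumed convention)
specDb = 20 * log10(spec / max(spec))

# Exclude cells quieter than throwaway dB
masked = spec
masked[specDb < throwaway] = NA

# Exclude frequency bins above maxFreq
masked = masked[as.numeric(rownames(masked)) <= maxFreq, , drop = FALSE]
masked
```

In this toy case the 4000 Hz bin is dropped entirely, and the two cells at -140 dB and -160 dB are excluded from the comparison.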

summary

if TRUE, returns the mean of the similarity values calculated by all methods listed in method; if FALSE, returns a separate value for each method

Examples

target = soundgen(sylLen = 500, formants = 'a',
                  pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1),
                                            value = c(100, 150, 135, 100)),
                  temperature = 0)
targetSpec = soundgen:::getMelSpec(target, samplingRate = 16000)
parsToTry = list(
  list(formants = 'i',                                            # wrong
       pitchAnchors = data.frame(time = c(0, 1),                  # wrong
                                 value = c(200, 300))),
  list(formants = 'i',                                            # wrong
       pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1),        # right
                                 value = c(100, 150, 135, 100))),
  list(formants = 'a',                                            # right
       pitchAnchors = data.frame(time = c(0, 1),                  # wrong
                                 value = c(200, 300))),
  list(formants = 'a',                                            # right
       pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1),        # right
                                 value = c(100, 150, 135, 100)))  # right
)

sounds = list()
for (s in seq_along(parsToTry)) {
  sounds[[s]] = do.call(soundgen,
    c(parsToTry[[s]], list(temperature = 0, sylLen = 500)))
}

method = c('cor', 'cosine', 'pixel', 'dtw')
df = matrix(NA, nrow = length(parsToTry), ncol = length(method))
colnames(df) = method
df = as.data.frame(df)
for (i in seq_len(nrow(df))) {
  df[i, ] = compareSounds(
    target = NULL,            # can use target instead of targetSpec...
    targetSpec = targetSpec,  # ...but faster to calculate targetSpec once
    cand = sounds[[i]],
    samplingRate = 16000,
    padWith = NA,
    penalizeLengthDif = TRUE,
    method = method,
    summary = FALSE
  )
}
df$av = rowMeans(df, na.rm = TRUE)
df  # row 1 = wrong pitch & formants, ..., row 4 = right pitch & formants

tatters/soundgen_beta documentation built on May 14, 2019, 9 a.m.