Computes similarity between two sounds based on correlating mel-transformed
spectra (auditory spectra). Called by matchPars.
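For readers unfamiliar with the term, a "mel-transformed" spectrum has its frequency axis warped to approximate perceived pitch. The sketch below shows one widely used Hz-to-mel formula in base R. It is an illustration only: the helper names hzToMel and melToHz are hypothetical, and the exact conversion used inside soundgen may differ.

```r
# One common Hz <-> mel conversion (an assumption for illustration;
# not necessarily the formula soundgen applies internally).
hzToMel <- function(hz) 2595 * log10(1 + hz / 700)
melToHz <- function(mel) 700 * (10 ^ (mel / 2595) - 1)

# Equal steps in Hz become progressively smaller steps in mel,
# so high frequencies are compressed relative to low ones.
freqs <- c(0, 500, 1000, 2000, 4000, 8000)
mels <- hzToMel(freqs)
print(round(mels))
```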
target |
the sound we want to reproduce using soundgen: path to a .wav file or numeric vector |
targetSpec |
if already calculated, the target auditory spectrum can be provided to speed things up |
cand |
the sound to be compared to |
samplingRate |
sampling rate of |
method |
method of comparing mel-transformed spectra of two sounds:
"cor" = average Pearson's correlation of mel-transformed spectra of
individual FFT frames; "cosine" = same as "cor" but with cosine similarity
instead of Pearson's correlation; "pixel" = absolute difference between
each point in the two spectra; "dtw" = dynamic time warping with
|
windowLength |
length of FFT window, ms |
overlap |
overlap between successive FFT frames, % |
step |
you can override |
padWith |
compared spectra are padded with either silence ( |
penalizeLengthDif |
if TRUE, sounds of different length are considered to be less similar; if FALSE, only the overlapping parts of two sounds are compared |
throwaway |
parts of the spectra quieter than |
maxFreq |
parts of the spectra above |
summary |
if TRUE, returns the mean of similarity values calculated by
all methods in |
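To make the "cor", "cosine" and "pixel" options of method more concrete, here is a minimal base-R sketch that applies each idea to two small made-up spectra (rows = frequency bins, columns = FFT frames). The function names frameCor, frameCosine and pixelSim are hypothetical, the normalization in pixelSim is an assumption, and none of this is the code that compareSounds actually runs.

```r
# Toy spectra with invented values; matrix() fills column by column,
# so each group of three numbers is one FFT frame.
specA <- matrix(c(1, 2, 3,   2, 4, 6,   3, 1, 2), nrow = 3)
specB <- matrix(c(1, 2, 3,   6, 4, 2,   3, 1, 2), nrow = 3)

# "cor": average Pearson's correlation across matching frames
frameCor <- function(a, b) {
  mean(sapply(seq_len(ncol(a)), function(i) cor(a[, i], b[, i])))
}

# "cosine": same idea, but with cosine similarity per frame
frameCosine <- function(a, b) {
  mean(sapply(seq_len(ncol(a)), function(i) {
    sum(a[, i] * b[, i]) / sqrt(sum(a[, i]^2) * sum(b[, i]^2))
  }))
}

# "pixel": absolute difference at each point, turned into a similarity
# by scaling both spectra to a maximum of 1 (assumed normalization)
pixelSim <- function(a, b) {
  1 - mean(abs(a / max(a) - b / max(b)))
}

simCor <- frameCor(specA, specB)
simCosine <- frameCosine(specA, specB)
simPixel <- pixelSim(specA, specB)
print(c(cor = simCor, cosine = simCosine, pixel = simPixel))
```

In this toy case the middle frame is reversed in specB, so its Pearson correlation is strongly negative while its cosine similarity stays positive, which illustrates why the methods can rank candidates differently.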
library(soundgen)

# Synthesize the target sound and calculate its auditory spectrum once
target = soundgen(sylLen = 500, formants = 'a',
                  pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1),
                                            value = c(100, 150, 135, 100)),
                  temperature = 0)
targetSpec = soundgen:::getMelSpec(target, samplingRate = 16000)

# Four candidate settings, from least to most similar to the target
parsToTry = list(
  list(formants = 'i',                                        # wrong
       pitchAnchors = data.frame(time = c(0, 1),              # wrong
                                 value = c(200, 300))),
  list(formants = 'i',                                        # wrong
       pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1),    # right
                                 value = c(100, 150, 135, 100))),
  list(formants = 'a',                                        # right
       pitchAnchors = data.frame(time = c(0, 1),              # wrong
                                 value = c(200, 300))),
  list(formants = 'a',                                        # right
       pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1),    # right
                                 value = c(100, 150, 135, 100)))
)

# Synthesize one candidate sound per parameter set
sounds = list()
for (s in seq_along(parsToTry)) {
  sounds[[s]] = do.call(soundgen,
    c(parsToTry[[s]], list(temperature = 0, sylLen = 500)))
}

# Compare each candidate to the target with every method
method = c('cor', 'cosine', 'pixel', 'dtw')
df = matrix(NA, nrow = length(parsToTry), ncol = length(method))
colnames(df) = method
df = as.data.frame(df)
for (i in seq_len(nrow(df))) {
  df[i, ] = compareSounds(
    target = NULL,            # can use target instead of targetSpec...
    targetSpec = targetSpec,  # ...but faster to calculate targetSpec once
    cand = sounds[[i]],
    samplingRate = 16000,
    padWith = NA,
    penalizeLengthDif = TRUE,
    method = method,
    summary = FALSE
  )
}
df$av = rowMeans(df, na.rm = TRUE)
df  # row 1 = wrong pitch & formants, ..., row 4 = right pitch & formants
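The windowLength and overlap arguments together determine how far apart successive FFT frames start. The sketch below assumes the usual relation between window length, overlap and step, and counts only frames that fit entirely inside the sound. The helper names frameStep and countFrames are hypothetical, and compareSounds may treat the edges of a sound differently.

```r
# Step between frame starts (ms), assuming
# step = windowLength * (1 - overlap / 100)
frameStep <- function(windowLength, overlap) {
  windowLength * (1 - overlap / 100)
}

# Number of full frames in a sound of a given duration (ms),
# with no padding at the edges (an assumption)
countFrames <- function(duration, windowLength, overlap) {
  step <- frameStep(windowLength, overlap)
  floor((duration - windowLength) / step) + 1
}

# Illustrative values only: a 500 ms sound, 40 ms window, 50% overlap
stepMs <- frameStep(windowLength = 40, overlap = 50)
nFrames <- countFrames(duration = 500, windowLength = 40, overlap = 50)
print(c(step = stepMs, frames = nFrames))
```

A larger overlap gives a smaller step and therefore more frames to compare, at the cost of more computation.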