matchPars: Match soundgen pars (experimental)

View source: R/matchPars.R

Description

Attempts to find settings for soundgen that will reproduce an existing sound. The principle is to mutate control parameters and keep the mutations that improve the fit to target. The currently implemented optimization algorithm is simple hill climbing. Disclaimer: this function is experimental and may or may not work for particular tasks. It is intended as a supplement to manual optimization, not a replacement for it. See the vignette on sound generation for more information.
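The mutate-and-accept loop described above can be illustrated with a toy hill climber in base R. This is only a sketch under stated assumptions, not soundgen's implementation: the function hill_climb, the fitness function fit, and the rule for scaling mutation size are all hypothetical, and the fit is computed directly on parameter values rather than on spectra of synthesized sounds.

```r
# Toy hill climbing (hypothetical sketch, NOT the code used by matchPars).
hill_climb = function(init, fit_fun, probMutation = 0.25, stepVariance = 0.1,
                      maxIter = 50, minExpectedDelta = 0.001) {
  best = init
  best_fit = fit_fun(best)
  failures = 0  # consecutive candidates that failed to improve the fit
  while (failures < maxIter) {
    cand = best
    # each parameter mutates with probability probMutation
    mutate = runif(length(cand)) < probMutation
    # assumption: force at least one mutation so every candidate differs
    if (!any(mutate)) mutate[sample.int(length(cand), 1)] = TRUE
    # assumption: mutation size scales with stepVariance and the current value
    cand[mutate] = cand[mutate] +
      rnorm(sum(mutate), mean = 0,
            sd = stepVariance * pmax(abs(cand[mutate]), 1))
    cand_fit = fit_fun(cand)
    if (cand_fit - best_fit > minExpectedDelta) {
      # accept the candidate only if it improves the fit by enough
      best = cand
      best_fit = cand_fit
      failures = 0
    } else {
      failures = failures + 1
    }
  }
  list(pars = best, sim = best_fit)
}

set.seed(1)
target_pars = c(rolloff = -5, sylLen = 120)  # values we pretend not to know
# hypothetical fitness: negative squared relative error (higher is better)
fit = function(p) -sum(((p - target_pars) / abs(target_pars)) ^ 2)
start = c(rolloff = -12, sylLen = 300)
res = hill_climb(init = start, fit_fun = fit)
print(res$pars)
print(res$sim)
```

Because a candidate is accepted only when it beats the current best by more than minExpectedDelta, the returned fit can never be worse than the fit of the starting values, but nothing guarantees that the search reaches the global optimum.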

Usage

matchPars(target, samplingRate = NULL, pars = NULL, init = NULL,
  method = c("cor", "cosine", "pixel", "dtw"), probMutation = 0.25,
  stepVariance = 0.1, maxIter = 50, minExpectedDelta = 0.001,
  windowLength = 40, overlap = 50, step = NULL, verbose = TRUE,
  padWith = NA, penalizeLengthDif = TRUE, throwaway = -120,
  maxFreq = NULL)

Arguments

target

the sound we want to reproduce using soundgen: path to a .wav file or numeric vector

samplingRate

sampling rate of target (only needed if target is a numeric vector, rather than a .wav file)

pars

arguments to soundgen that we are attempting to optimize

init

a list of initial values for the optimized parameters pars and the values of other arguments to soundgen that are fixed at non-default values (if any)

method

method of comparing mel-transformed spectra of two sounds: "cor" = average Pearson's correlation of mel-transformed spectra of individual FFT frames; "cosine" = same as "cor" but with cosine similarity instead of Pearson's correlation; "pixel" = absolute difference between each point in the two spectra; "dtw" = dynamic time warping with dtw

probMutation

the probability of a parameter mutating per iteration

stepVariance

scale factor for calculating the size of mutations

maxIter

maximum number of mutated sounds produced without improving the fit to target

minExpectedDelta

minimum improvement in fit to target required to accept the new sound candidate

windowLength

length of FFT window, ms

overlap

overlap between successive FFT frames, %

step

you can override overlap by specifying FFT step, ms

verbose

if TRUE, plays back the accepted candidate at each iteration and reports the outcome

padWith

compared spectra are padded with either silence (padWith = 0) or NAs (padWith = NA) so that they have the same number of columns. When the sounds are of different duration, padding with zeros rather than NAs improves the fit to target measured by method = 'pixel' and 'dtw', but it has no effect on 'cor' and 'cosine'.

penalizeLengthDif

if TRUE, sounds of different length are considered to be less similar; if FALSE, only the overlapping parts of two sounds are compared

throwaway

parts of the spectra quieter than throwaway dB are not compared

maxFreq

parts of the spectra above maxFreq Hz are not compared
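To make the interplay of windowLength, overlap, padWith, and method more concrete, here is a minimal base R sketch of how two spectrograms might be framed, padded, and compared. All helper names (frame_step, pad_cols, compare_spectra) are hypothetical and are not soundgen internals. The step formula is the conventional relation between window length and overlap, converting the "pixel" difference into a similarity as 1 minus the mean absolute difference is an assumption, and "dtw" is omitted because it needs the dtw package.

```r
# Hypothetical helpers for illustration only, NOT soundgen's internals.

# Conventional relation between FFT window length (ms), overlap (%), and step (ms)
frame_step = function(windowLength = 40, overlap = 50) {
  windowLength * (1 - overlap / 100)
}

# Pad a spectrogram (rows = frequency bins, columns = frames) to n_cols columns
pad_cols = function(m, n_cols, padWith = NA) {
  extra = n_cols - ncol(m)
  if (extra <= 0) return(m)
  cbind(m, matrix(padWith, nrow = nrow(m), ncol = extra))
}

compare_spectra = function(a, b, method = c("cor", "cosine", "pixel"),
                           padWith = NA) {
  method = match.arg(method)
  n = max(ncol(a), ncol(b))
  a = pad_cols(a, n, padWith)
  b = pad_cols(b, n, padWith)
  if (method == "pixel") {
    # assumption: similarity = 1 - mean absolute difference, NAs skipped
    return(1 - mean(abs(a - b), na.rm = TRUE))
  }
  per_frame = sapply(seq_len(n), function(i) {
    x = a[, i]
    y = b[, i]
    if (anyNA(x) || anyNA(y)) return(NA_real_)
    if (method == "cor") {
      # correlation is undefined for a constant (e.g. silent) frame
      if (sd(x) == 0 || sd(y) == 0) return(NA_real_)
      cor(x, y)
    } else {
      denom = sqrt(sum(x ^ 2)) * sqrt(sum(y ^ 2))
      if (denom == 0) NA_real_ else sum(x * y) / denom
    }
  })
  mean(per_frame, na.rm = TRUE)  # average over frames that could be compared
}

set.seed(42)
a = matrix(runif(40), nrow = 8)                   # 8 bins x 5 frames
b = cbind(a + 0.01, matrix(runif(16), nrow = 8))  # 8 bins x 7 frames
sim_cor_na = compare_spectra(a, b, "cor", padWith = NA)
sim_cor_0 = compare_spectra(a, b, "cor", padWith = 0)
sim_pixel_na = compare_spectra(a, b, "pixel", padWith = NA)
print(frame_step(40, 50))
print(c(sim_cor_na, sim_cor_0, sim_pixel_na))
```

In this sketch a padded frame is skipped by "cor" and "cosine" whether it holds NAs or zeros, because neither measure is defined for an all-zero frame. That is one way to see why the choice of padWith does not change those two measures.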

Value

Returns a list of length 2: $history contains the tried parameter values together with their fit to target ($history$sim), and $pars contains a list of the final (hopefully the best) parameter settings.
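As a rough illustration of how the returned list might be inspected, the sketch below builds a mock object by hand instead of calling matchPars. It assumes that each element of $history is a list holding $pars and $sim. The object m and every number in it are invented placeholders, not real output.

```r
# Mock return value with invented placeholder numbers (NOT real matchPars output).
# Assumed layout: each history entry stores the tried pars and their fit (sim).
m = list(
  history = list(
    list(pars = list(rolloff = -12), sim = 0.61),
    list(pars = list(rolloff = -9.8), sim = 0.64),
    list(pars = list(rolloff = -6.1), sim = 0.70)
  ),
  pars = list(rolloff = -6.1)
)

sims = sapply(m$history, function(x) x$sim)               # fit per tried candidate
rolloffs = sapply(m$history, function(x) x$pars$rolloff)  # tried rolloff values
best = which.max(sims)                                    # index of the best fit
print(data.frame(rolloff = rolloffs, sim = sims))
print(m$history[[best]]$pars)
```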

Examples

playback = c(TRUE, FALSE)[2]  # set to TRUE to play back the audio from examples

target = soundgen(repeatBout = 3, sylLen = 120, pauseLen = 70,
  pitchAnchors = data.frame(time = c(0, 1), value = c(300, 200)),
  rolloff = -5, play = playback)  # we hope to reproduce this sound

# Match pars based on acoustic analysis alone, without any optimization.
# This *MAY* match temporal structure, pitch, and stationary formants
m1 = matchPars(target = target,
               samplingRate = 16000,
               maxIter = 0,  # no optimization, only acoustic analysis
               verbose = playback)
cand1 = do.call(soundgen, c(m1$pars, list(play = playback, temperature = 0)))

# Try to improve the match by optimizing rolloff
# (this may take a few minutes to run, and the results may vary)
## Not run: 
m2 = matchPars(target = target,
               samplingRate = 16000,
               pars = 'rolloff',
               maxIter = 100,
               verbose = playback)
# rolloff should be moving from default (-12) to target (-5):
sapply(m2$history, function(x) x$pars$rolloff)
cand2 = do.call(soundgen, c(m2$pars, list(play = playback, temperature = 0)))

## End(Not run)

tatters/soundgen_beta documentation built on May 14, 2019, 9 a.m.