Generates a bout of one or more syllables with pauses between them. Two basic components are synthesized: the harmonic component (the sum of sine waves with frequencies that are multiples of the fundamental frequency) and the noise component. Both components can be filtered with independently specified formants. Intonation and amplitude contours can be applied both within each syllable and across multiple syllables. Suggested application: synthesis of animal or human non-linguistic vocalizations. For more information, see http://cogsci.se/soundgen.html and the vignette on sound generation.
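To make the description of the harmonic component concrete, here is a minimal base-R sketch of additive synthesis: a stack of sine waves at integer multiples of f0, each attenuated by a fixed number of dB per octave, plus a little white noise. It is not soundgen's implementation. `harmonic_stack()` and its arguments are hypothetical names chosen for illustration, and the formant filtering and intonation/amplitude contours described above are left out.

```r
# Hypothetical helper, NOT part of soundgen: additive synthesis of a steady tone.
harmonic_stack <- function(f0 = 150, dur_ms = 300, samplingRate = 16000,
                           rolloff_dB_oct = -12, noise_dB = -40) {
  n <- round(dur_ms / 1000 * samplingRate)      # number of audio samples
  t <- (seq_len(n) - 1) / samplingRate          # time of each sample, s
  nHarm <- floor((samplingRate / 2) / f0)       # keep harmonics up to Nyquist
  h <- seq_len(nHarm)                           # harmonic numbers 1, 2, 3, ...
  amp <- 10 ^ (rolloff_dB_oct * log2(h) / 20)   # dB per octave -> linear gain
  # Harmonic component: sum of sine waves at multiples of f0
  voiced <- as.numeric(sin(2 * pi * outer(t, f0 * h)) %*% amp)
  voiced <- voiced / max(abs(voiced))
  # Noise component: white noise scaled relative to the voiced peak (assumption)
  noise <- rnorm(n) * 10 ^ (noise_dB / 20)
  out <- voiced + noise
  out / max(abs(out))                           # peak-normalize to [-1, 1]
}

set.seed(1)
wave <- harmonic_stack()
length(wave)  # 4800 samples = 300 ms at 16 kHz
```

The result is a plain numeric vector, the same kind of object that `soundgen()` is documented to return.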
soundgen(repeatBout = 1, nSyl = 1, sylLen = 300, pauseLen = 200,
pitchAnchors = data.frame(time = c(0, 0.1, 0.9, 1), value = c(100, 150, 135,
100)), pitchAnchorsGlobal = NA, temperature = 0.025,
tempEffects = list(sylLenDep = 0.02, formDrift = 0.3, formDisp = 0.2,
pitchDriftDep = 0.5, pitchDriftFreq = 0.125, pitchAnchorsDep = 0.05,
noiseAnchorsDep = 0.1, amplAnchorsDep = 0.1), maleFemale = 0,
creakyBreathy = 0, nonlinBalance = 0, nonlinDep = 50, jitterLen = 1,
jitterDep = 3, vibratoFreq = 5, vibratoDep = 0, shimmerDep = 0,
attackLen = 50, rolloff = -12, rolloffOct = -12, rolloffKHz = -6,
rolloffParab = 0, rolloffParabHarm = 3, rolloffLip = 6,
formants = list(f1 = list(time = 0, freq = 860, amp = 30, width = 120), f2 =
list(time = 0, freq = 1280, amp = 40, width = 120), f3 = list(time = 0, freq =
2900, amp = 25, width = 200)), formantDep = 1, formantDepStoch = 30,
vocalTract = 15.5, subFreq = 100, subDep = 100, shortestEpoch = 300,
amDep = 0, amFreq = 30, amShape = 0, noiseAnchors = data.frame(time =
c(0, 300), value = c(-120, -120)), formantsNoise = NA, rolloffNoise = -14,
mouthAnchors = data.frame(time = c(0, 1), value = c(0.5, 0.5)),
amplAnchors = NA, amplAnchorsGlobal = NA, samplingRate = 16000,
windowLength = 50, overlap = 75, addSilence = 100, pitchFloor = 50,
pitchCeiling = 3500, pitchSamplingRate = 3500, throwaway = -120,
invalidArgAction = c("adjust", "abort", "ignore")[1], plot = FALSE,
play = FALSE, savePath = NA, ...)
repeatBout |
the number of times the whole bout should be repeated |
nSyl |
the number of syllables in the bout. Intonation, amplitude, and formant contours span multiple syllables, but not multiple bouts (see Details) |
sylLen |
average duration of each syllable, ms |
pauseLen |
average duration of pauses between syllables, ms |
pitchAnchors |
a numeric vector of f0 values in Hz (assuming equal time steps) or a dataframe specifying the time (ms) and value (Hz) of each anchor. These anchors are used to create a smooth contour of fundamental frequency f0 (pitch) within one syllable (see Examples) |
pitchAnchorsGlobal |
unlike |
temperature |
hyperparameter for regulating the amount of stochasticity in sound generation |
tempEffects |
a list of scale factors regulating the effect of
temperature on particular parameters. To change them, specify only the
parameters you want to modify rather than rewriting the whole list
(defaults are hard-coded). |
maleFemale |
hyperparameter for shifting f0 contour, formants, and vocalTract to make the speaker appear more male (-1...0) or more female (0...+1) |
creakyBreathy |
hyperparameter for a rough adjustment of voice quality from creaky (-1) to breathy (+1) |
nonlinBalance |
hyperparameter for regulating the (approximate) proportion of sound with different regimes of pitch effects (none / subharmonics only / subharmonics and jitter). 0% = no noise; 100% = the entire sound has jitter + subharmonics. Ignored if temperature = 0 |
nonlinDep |
hyperparameter for regulating the intensity of subharmonics and jitter, 0 to 100% (50% = jitter and subharmonics are as specified, <50% weaker, >50% stronger). Ignored if temperature = 0 |
jitterLen |
duration of stable periods between pitch jumps, ms. Use a low value for harsh noise, a high value for irregular vibrato or shaky voice |
jitterDep |
cycle-to-cycle random pitch variation, semitones |
vibratoFreq |
the rate of regular pitch modulation, or vibrato, Hz |
vibratoDep |
the depth of vibrato, semitones |
shimmerDep |
random variation in amplitude between individual glottal cycles (0 to 100% of original amplitude of each cycle) |
attackLen |
duration of fade-in / fade-out at each end of syllables and noise (ms) |
rolloff |
basic rolloff at a constant rate of |
rolloffOct |
basic rolloff changes from lower to upper
harmonics (regardless of f0) by |
rolloffKHz |
rolloff changes linearly with f0 by
|
rolloffParab |
an optional quadratic term affecting only the
first |
rolloffParabHarm |
the number of harmonics affected by
|
rolloffLip |
the effect of lip radiation on source spectrum, dB/oct (the default of +6 dB/oct produces a high-frequency boost when the mouth is open) |
formants |
either a character string like "aaui" referring to default
presets for speaker "M1" or a list of formant times, frequencies,
amplitudes, and bandwidths (see ex. below). |
formantDep |
scale factor of formant amplitude (1 = no change relative
to amplitudes in |
formantDepStoch |
the amplitude of additional stochastic formants added above the highest specified formant, dB (only if temperature > 0) |
vocalTract |
the length of vocal tract, cm. Used for calculating formant dispersion (for adding extra formants) and formant transitions as the mouth opens and closes |
subFreq |
target frequency of subharmonics, Hz (lower than f0, adjusted dynamically so f0 is always a multiple of subFreq) |
subDep |
the width of subharmonic band, Hz. Regulates how quickly the strength of subharmonics fades as they move away from harmonics in f0 stack. Low values produce narrow sidebands, high values produce uniformly strong subharmonics |
shortestEpoch |
minimum duration of each epoch with unchanging subharmonics regime, in ms |
amDep |
amplitude modulation depth, modulation with amplitude range equal to the dynamic range of the sound |
amFreq |
amplitude modulation frequency, Hz |
amShape |
amplitude modulation shape (-1 to +1, defaults to 0) |
noiseAnchors |
a numeric vector of noise amplitudes (-120 dB = none, 0 dB = as loud as voiced component) or a dataframe specifying the time (ms) and amplitude (dB) of anchors for generating the noise component such as aspiration, hissing, etc |
formantsNoise |
the same as |
rolloffNoise |
rolloff of noise, dB/octave. It is analogous to
|
mouthAnchors |
a numeric vector of mouth opening (0 to 1, 0.5 = neutral, i.e. no modification) or a dataframe specifying the time (ms) and value of mouth opening |
amplAnchors |
a numeric vector of amplitude envelope (0 to 1) or a dataframe specifying the time (ms) and value of amplitude anchors |
amplAnchorsGlobal |
a numeric vector of global amplitude envelope spanning multiple syllables or a dataframe specifying the time (ms) and value (0 to 1) of each anchor |
samplingRate |
sampling frequency, Hz |
windowLength |
length of FFT window, ms |
overlap |
FFT window overlap, % |
addSilence |
silence before and after the bout, ms |
pitchFloor, pitchCeiling |
lower and upper bounds of f0 |
pitchSamplingRate |
sampling frequency of the pitch contour only, Hz. Low
values reduce processing time. A rule of thumb is to set this to
the same value as |
throwaway |
discard harmonics and noise that are quieter than this number (in dB, defaults to -120) to save computational resources |
invalidArgAction |
what to do if an argument is invalid or outside the
range in |
plot |
if TRUE, plots a spectrogram |
play |
if TRUE, plays the synthesized sound. In case of errors, try
setting another default player for |
savePath |
full path for saving the output, e.g. '~/Downloads/temp.wav'. If NA (default), doesn't save anything |
... |
other plotting parameters passed to |
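The pitch-related arguments above can be pictured with a small base-R sketch: anchors are interpolated into a smooth f0 contour, vibrato is added as a sinusoidal deviation measured in semitones, and the result is kept within pitchFloor and pitchCeiling. The interpolation method (a natural cubic spline on a semitone scale) and the reading of vibratoDep as peak deviation are assumptions made for illustration; soundgen's own smoothing may differ, and `sketch_pitch_contour()` is a hypothetical name.

```r
# Hypothetical illustration, NOT soundgen's internal algorithm.
sketch_pitch_contour <- function(
    anchors = data.frame(time = c(0, 0.1, 0.9, 1),
                         value = c(100, 150, 135, 100)),
    sylLen = 300, pitchSamplingRate = 3500,
    vibratoFreq = 5, vibratoDep = 0,
    pitchFloor = 50, pitchCeiling = 3500) {
  n <- round(sylLen / 1000 * pitchSamplingRate)   # number of f0 samples
  # Rescale anchor times to 0..1 so the contour stretches over the syllable
  x <- (anchors$time - min(anchors$time)) / diff(range(anchors$time))
  # Interpolate on a semitone (log) scale; the spline is an assumption
  semitones <- 12 * log2(anchors$value)
  smooth_st <- spline(x, semitones, n = n, method = "natural")$y
  # Vibrato: sinusoidal deviation with rate in Hz and depth in semitones
  t <- seq(0, sylLen / 1000, length.out = n)
  smooth_st <- smooth_st + vibratoDep * sin(2 * pi * vibratoFreq * t)
  f0 <- 2 ^ (smooth_st / 12)                      # back to Hz
  pmin(pmax(f0, pitchFloor), pitchCeiling)        # clamp to the allowed range
}

f0 <- sketch_pitch_contour(vibratoDep = 0.5)
length(f0)  # 1050 pitch samples for 300 ms at 3500 Hz
range(f0)
```

Working in semitones rather than Hz keeps a rise from 100 to 150 Hz perceptually comparable to the same interval at a higher pitch, which is why the vibrato depth is also added on that scale.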
Returns the synthesized waveform as a numeric vector.
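Because the return value is a plain numeric vector, its duration follows from its length and samplingRate. The base-R sketch below compares that with the nominal duration implied by the timing arguments. It assumes temperature = 0 (so every syllable lasts exactly sylLen), pauses only between syllables, and addSilence applied once at each end of a single bout; these are assumptions for illustration. `nominal_duration_ms()` and `wave_duration_ms()` are hypothetical helpers, not soundgen functions.

```r
# Hypothetical helper: nominal duration of one bout, in ms
nominal_duration_ms <- function(nSyl = 1, sylLen = 300, pauseLen = 200,
                                addSilence = 100) {
  nSyl * sylLen + (nSyl - 1) * pauseLen + 2 * addSilence
}

# Hypothetical helper: duration of a waveform stored as a numeric vector, in ms
wave_duration_ms <- function(wave, samplingRate = 16000) {
  length(wave) / samplingRate * 1000
}

nominal_duration_ms()                                        # 500 ms with the defaults
nominal_duration_ms(nSyl = 5, sylLen = 200, pauseLen = 140)  # 1760 ms
wave_duration_ms(numeric(8000))                              # 500 ms at 16 kHz
```

With temperature above 0, sylLen and pauseLen are only averages, so the actual length of the returned vector can deviate from this nominal figure.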
# NB: GUI for soundgen is available as a Shiny app.
# Type "soundgen_app()" to start it
playback = c(TRUE, FALSE)[2] # set to TRUE to play back the audio from examples
sound = soundgen(play = playback)
# spectrogram(sound, 16000, osc = TRUE)
# playme(sound)
# Use the in-built collection of presets:
# names(presets) # speakers
# names(presets$Chimpanzee) # calls per speaker
s1 = eval(parse(text = presets$Chimpanzee$Scream_conflict)) # screaming chimp
# playme(s1)
s2 = eval(parse(text = presets$F1$Scream_conflict))
# playme(s2)
# unless temperature is 0, the sound is different every time
for (i in 1:3) sound = soundgen(play = playback, temperature = .2)
# Bouts versus syllables. Compare:
sound = soundgen(formants = 'uai', repeatBout = 3, play = playback)
sound = soundgen(formants = 'uai', nSyl = 3, play = playback)
# Intonation contours per syllable and globally:
sound = soundgen(nSyl = 5, sylLen = 200, pauseLen = 140,
play = playback, pitchAnchors = data.frame(
time = c(0, 0.65, 1), value = c(977, 1540, 826)),
pitchAnchorsGlobal = data.frame(time = c(0, .5, 1), value = c(-6, 7, 0)))
# Subharmonics in sidebands (noisy scream)
sound = soundgen(nonlinBalance = 100, subFreq = 75, subDep = 130,
pitchAnchors = data.frame(
time = c(0, .3, .9, 1), value = c(1200, 1547, 1487, 1154)),
sylLen = 800,
play = playback, plot = TRUE)
# Jitter and mouth opening (bark, dog-like)
sound = soundgen(repeatBout = 2, sylLen = 160, pauseLen = 100,
nonlinBalance = 100, subFreq = 100, subDep = 60, jitterDep = 1,
pitchAnchors = data.frame(time = c(0, 0.52, 1), value = c(559, 785, 557)),
mouthAnchors = data.frame(time = c(0, 0.5, 1), value = c(0, 0.5, 0)),
vocalTract = 5, play = playback)