getSpectralEnvelope: Spectral envelope
In tatters/soundgen_beta: Parametric Voice Synthesis


Prepares a spectral envelope for filtering a sound: it adds formants, lip radiation, and a stochastic component regulated by temperature. Formants are specified as a list containing time, frequency, amplitude, and width values for each formant (see examples). NB: each formant is generated as a gamma distribution with mean = freq and SD = width. Formant bandwidths in soundgen are therefore NOT compatible with the formant bandwidths used in the Klatt synthesizer and other algorithms that rely on FIR filters instead of FFT.
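The gamma parameterization mentioned above can be illustrated in base R. This is a minimal sketch, not the package's internal code: the helper `gammaFormant` and the frequency grid are assumptions made for illustration. It only shows how a mean and SD translate into the shape and rate of a gamma distribution.

```r
# Sketch only: a gamma-shaped formant with mean = freq and SD = width.
# For a gamma distribution, mean = shape / rate and variance = shape / rate^2,
# so shape = (mean / sd)^2 and rate = mean / sd^2.
gammaFormant <- function(freqs, freq, width) {  # hypothetical helper
  shape <- (freq / width)^2
  rate <- freq / width^2
  d <- dgamma(freqs, shape = shape, rate = rate)
  d / max(d)  # normalize the peak to 1 for easier comparison
}

freqs <- seq(1, 8000, length.out = 512)  # frequency axis, Hz (assumed grid)
f1 <- gammaFormant(freqs, freq = 900, width = 120)

# The peak sits slightly below 900 Hz, because the mode of a gamma
# distribution, (shape - 1) / rate, is lower than its mean.
freqs[which.max(f1)]
```

Because the curve is asymmetric and defined by its SD, `width` here is not a -3 dB bandwidth, which is one way to see why these values do not carry over to Klatt-style synthesizers.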

getSpectralEnvelope(nr, nc, formants = NA, formantDep = 1, rolloffLip = 6,
  mouthAnchors = NA, mouthOpenThres = 0, openMouthBoost = 0,
  vocalTract = NULL, temperature = 0, formDrift = 0.3, formDisp = 0.2,
  formantDepStoch = 30, smoothLinearFactor = 1, samplingRate = 16000,
  speedSound = 35400, plot = FALSE, duration = NULL,
  colorTheme = c("bw", "seewave", "...")[1], nCols = 100, xlab = "Time",
  ylab = "Frequency, kHz", ...)

`nr`	the number of frequency bins = windowLength_points/2, where windowLength_points is the size of window for Fourier transform
`nc`	the number of time steps for Fourier transform
`formants`	either a character string like "aaui" referring to default presets for speaker "M1" or a list of formant times, frequencies, amplitudes, and bandwidths. `formants = NA` defaults to schwa. Time stamps for formants and mouth opening can be specified in ms or in any other arbitrary scale.
`formantDep`	scale factor of formant amplitude (1 = no change relative to amplitudes in `formants`)
`rolloffLip`	the effect of lip radiation on source spectrum, dB/oct (the default of +6 dB/oct produces a high-frequency boost when the mouth is open)
`mouthAnchors`	a numeric vector of mouth opening (0 to 1, 0.5 = neutral, i.e. no modification) or a dataframe specifying the time (ms) and value of mouth opening
`mouthOpenThres`	the mouth is considered to be open when its opening is greater than `mouthOpenThres`. Defaults to 0
`openMouthBoost`	amplify the voice when the mouth is open by `openMouthBoost` dB
`vocalTract`	the length of vocal tract, cm. Used for calculating formant dispersion (for adding extra formants) and formant transitions as the mouth opens and closes
`temperature`	hyperparameter for regulating the amount of stochasticity in sound generation
`formDrift`	scale factor regulating the effect of temperature on the depth of random drift of all formants (user-defined and stochastic): the higher, the more formants drift at a given temperature
`formDisp`	scale factor regulating the effect of temperature on the irregularity of the dispersion of stochastic formants: the higher, the more unevenly stochastic formants are spaced at a given temperature
`formantDepStoch`	the amplitude of additional formants added above the highest specified formant (only if temperature > 0)
`smoothLinearFactor`	regulates smoothing of formant anchors (0 to +Inf) as they are upsampled to the number of fft steps `nc`. This is necessary because the input `formants` normally contains fewer sets of formant values than the number of fft steps. `smoothLinearFactor` = 0: close to default spline; >3: approaches linear extrapolation
`samplingRate`	sampling frequency, Hz
`speedSound`	speed of sound in warm air, cm/s. Stevens (2000) "Acoustic phonetics", p. 138
`plot`	if TRUE, produces a plot of the spectral envelope
`duration`	duration of the sound, ms (for plotting purposes only)
`colorTheme`	black and white ('bw'), as in seewave package ('seewave'), or another color theme (e.g. 'heat.colors')
`nCols`	number of colors in the palette
`xlab, ylab`	labels of axes
`...`	other graphical parameters passed on to `image()`
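To make the link between `vocalTract`, `speedSound`, and formant dispersion concrete, here is a minimal sketch based on the textbook model of a uniform tube closed at the glottis and open at the lips. Whether the package uses exactly this formula internally is an assumption; `tubeFormants` and `nFormants` are hypothetical names introduced for illustration.

```r
# Sketch only: resonances of a uniform tube closed at one end (quarter-wavelength model).
# F_n = (2n - 1) * c / (4 * L), with c in cm/s and L in cm.
tubeFormants <- function(vocalTract, nFormants = 4, speedSound = 35400) {  # hypothetical helper
  n <- seq_len(nFormants)
  (2 * n - 1) * speedSound / (4 * vocalTract)  # Hz
}

f <- tubeFormants(vocalTract = 15.5)
round(f)    # neutral (schwa-like) formant frequencies, Hz
diff(f)[1]  # formant dispersion = speedSound / (2 * vocalTract), Hz
```

Under this model a 15.5 cm vocal tract gives a first formant of about 571 Hz, with neighbouring formants evenly spaced at `speedSound / (2 * vocalTract)`.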

Returns a spectral filter (matrix nr x nc, where nr is the number of frequency bins = windowLength_points/2 and nc is the number of time steps)
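When reading the returned matrix, it can help to convert row and column indices into Hz and ms. The sketch below assumes that rows are evenly spaced bin centers from 0 to the Nyquist frequency and that columns span the whole sound; the `duration` value is hypothetical, and the exact bin alignment used by the package may differ.

```r
# Sketch only: map rows and columns of an nr x nc filter to Hz and ms.
nr <- 512               # frequency bins = windowLength_points / 2
nc <- 50                # time steps
samplingRate <- 16000   # Hz
duration <- 500         # ms, hypothetical value

nyquist <- samplingRate / 2
binWidth <- nyquist / nr                     # Hz per frequency bin
freqHz <- (seq_len(nr) - 0.5) * binWidth     # assumed center frequency of each row
timeMs <- seq(0, duration, length.out = nc)  # assumed time stamp of each column

# Index of the row closest to a 900 Hz formant
which.min(abs(freqHz - 900))
```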

# [a] with F1-F3 visible
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = soundgen:::convertStringToFormants('a'),
  temperature = 0)))
# some "wiggling" of specified formants plus extra formants on top
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = soundgen:::convertStringToFormants('a'),
  temperature = 0.1, formantDepStoch = 10)))
# stronger extra formants
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = soundgen:::convertStringToFormants('a'),
  temperature = 0.1, formantDepStoch = 30)))
# a schwa based on the length of vocal tract = 15.5 cm
image(t(getSpectralEnvelope(nr = 512, nc = 50, formants = NA,
  temperature = .1, vocalTract = 15.5)))

# no formants at all
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = NA, temperature = 0)))

# manual specification of formants
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  samplingRate = 16000, formants = list(
    'f1' = data.frame('time' = 0, 'freq' = 900, 'amp' = 30, 'width' = 120),
    'f2' = data.frame('time' = 0, 'freq' = 1300, 'amp' = 30, 'width' = 120),
    'f3' = data.frame('time' = 0, 'freq' = 3200, 'amp' = 20, 'width' = 200)))))