getSpectralEnvelope: Spectral envelope

View source: R/sourceSpectrum.R

Description

Prepares a spectral envelope for filtering a sound to add formants, lip radiation, and a stochastic component regulated by temperature. Formants are specified as a list containing time, frequency, amplitude, and width values for each formant (see examples). NB: each formant is generated as a gamma distribution with mean = freq and SD = width. Formant bandwidths in soundgen are therefore NOT compatible with the formant bandwidths used in the Klatt synthesizer and other algorithms that rely on FIR instead of FFT.
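The gamma parameterization mentioned above can be illustrated in base R. This is only a sketch, not the package's internal code: the helper gammaFormant and the frequency grid are hypothetical, and the shape and rate simply follow from the standard identities mean = shape / rate and SD = sqrt(shape) / rate.

```r
# Hypothetical helper (not part of soundgen): a gamma-shaped formant peak
# with mean = freq and SD = width, evaluated on a frequency grid in Hz.
gammaFormant <- function(freqGrid, freq, width) {
  shape <- (freq / width)^2   # mean^2 / sd^2
  rate <- freq / width^2      # mean / sd^2
  dens <- dgamma(freqGrid, shape = shape, rate = rate)
  dens / max(dens)            # normalize so that the peak equals 1
}

# Assumed grid: 512 points up to the Nyquist frequency at samplingRate = 16000
freqGrid <- seq(1, 8000, length.out = 512)
f1 <- gammaFormant(freqGrid, freq = 900, width = 120)

# The mode of a gamma distribution lies below its mean (here near 884 Hz),
# so the visible peak sits slightly under the nominal formant frequency.
peakFreq <- freqGrid[which.max(f1)]
```

Because the curve is asymmetric and its spread is set by an SD rather than a -3 dB bandwidth, the width value is not interchangeable with a resonator bandwidth.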

Usage

getSpectralEnvelope(nr, nc, formants = NA, formantDep = 1, rolloffLip = 6,
  mouthAnchors = NA, mouthOpenThres = 0, openMouthBoost = 0,
  vocalTract = NULL, temperature = 0, formDrift = 0.3, formDisp = 0.2,
  formantDepStoch = 30, smoothLinearFactor = 1, samplingRate = 16000,
  speedSound = 35400, plot = FALSE, duration = NULL,
  colorTheme = c("bw", "seewave", "...")[1], nCols = 100, xlab = "Time",
  ylab = "Frequency, kHz", ...)

Arguments

nr

the number of frequency bins = windowLength_points/2, where windowLength_points is the size of window for Fourier transform

nc

the number of time steps for Fourier transform

formants

either a character string like "aaui" referring to default presets for speaker "M1" or a list of formant times, frequencies, amplitudes, and bandwidths. formants = NA defaults to schwa. Time stamps for formants and mouthOpening can be specified in ms or on any other arbitrary scale.

formantDep

scale factor of formant amplitude (1 = no change relative to amplitudes in formants)

rolloffLip

the effect of lip radiation on source spectrum, dB/oct (the default of +6 dB/oct produces a high-frequency boost when the mouth is open)

mouthAnchors

a numeric vector of mouth opening (0 to 1, 0.5 = neutral, i.e. no modification) or a dataframe specifying the time (ms) and value of mouth opening

mouthOpenThres

the mouth is considered to be open when its opening is greater than mouthOpenThres. Defaults to 0

openMouthBoost

amplify the voice when the mouth is open by openMouthBoost dB

vocalTract

the length of vocal tract, cm. Used for calculating formant dispersion (for adding extra formants) and formant transitions as the mouth opens and closes

temperature

hyperparameter for regulating the amount of stochasticity in sound generation

formDrift

scale factor regulating the effect of temperature on the depth of random drift of all formants (user-defined and stochastic): the higher, the more formants drift at a given temperature

formDisp

scale factor regulating the effect of temperature on the irregularity of the dispersion of stochastic formants: the higher, the more unevenly stochastic formants are spaced at a given temperature

formantDepStoch

the amplitude of additional formants added above the highest specified formant (only if temperature > 0)

smoothLinearFactor

regulates smoothing of formant anchors (0 to +Inf) as they are upsampled to the number of FFT steps nc. This is necessary because the input formants normally contain fewer sets of formant values than the number of FFT steps. smoothLinearFactor = 0: close to default spline; >3: approaches linear extrapolation

samplingRate

sampling frequency, Hz

speedSound

speed of sound in warm air, cm/s. Stevens (2000) "Acoustic phonetics", p. 138

plot

if TRUE, produces a plot of the spectral envelope

duration

duration of the sound, ms (for plotting purposes only)

colorTheme

black and white ('bw'), as in seewave package ('seewave'), or another color theme (e.g. 'heat.colors')

nCols

number of colors in the palette

xlab, ylab

labels of axes

...

other graphical parameters passed on to image()
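Two of the arguments above have a simple acoustic reading that can be checked by hand. The sketch below rests on assumptions that are not confirmed by this page: the vocal tract is modeled as a uniform tube closed at the glottis and open at the lips, and lip radiation is applied as a constant slope in dB per octave around a hypothetical reference frequency refFreq. The function itself may compute these quantities differently.

```r
# Assumption: uniform tube closed at one end (quarter-wavelength resonator)
speedSound <- 35400   # cm/s, the default in the usage above
vocalTract <- 15.5    # cm
n <- 1:4
tubeFormants <- (2 * n - 1) * speedSound / (4 * vocalTract)  # resonances, Hz
formantDispersion <- speedSound / (2 * vocalTract)           # spacing, Hz

# Assumption: lip radiation as a constant slope relative to refFreq
rolloffLip <- 6       # dB/oct
refFreq <- 1000       # Hz, hypothetical reference frequency
freqs <- c(500, 1000, 2000, 4000)
lipGain_dB <- rolloffLip * log2(freqs / refFreq)

round(tubeFormants)   # 571 1713 2855 3997
lipGain_dB            # -6  0  6 12
```

With these defaults the tube resonances are spaced about 1142 Hz apart, which is the neutral, schwa-like pattern expected for a 15.5 cm vocal tract.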

Value

Returns a spectral filter (matrix nr x nc, where nr is the number of frequency bins = windowLength_points/2 and nc is the number of time steps)
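The shape of the returned matrix can be related to FFT settings as follows. This is a sketch with a placeholder matrix rather than actual output of getSpectralEnvelope; the window length is hypothetical, and the even spacing of bins from 0 Hz to the Nyquist frequency is an assumption.

```r
windowLength_points <- 1024     # hypothetical FFT window size
samplingRate <- 16000           # Hz
nr <- windowLength_points / 2   # number of frequency bins (rows)
nc <- 50                        # number of time steps (columns)

# Placeholder filter with no spectral shaping, same shape as the return value
spectralFilter <- matrix(1, nrow = nr, ncol = nc)

# Assumption: bins are evenly spaced between 0 Hz and the Nyquist frequency
binWidth <- samplingRate / windowLength_points   # Hz per bin
binFreqs <- (seq_len(nr) - 0.5) * binWidth       # approximate bin centers, Hz

dim(spectralFilter)   # 512 50
```

Each row then corresponds to one frequency bin and each column to one FFT frame, which is why the examples transpose the matrix with t() before calling image() to put time on the horizontal axis.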

Examples

# [a] with F1-F3 visible
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = soundgen:::convertStringToFormants('a'),
  temperature = 0)))
# some "wiggling" of specified formants plus extra formants on top
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = soundgen:::convertStringToFormants('a'),
  temperature = 0.1, formantDepStoch = 10)))
# stronger extra formants
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = soundgen:::convertStringToFormants('a'),
  temperature = 0.1, formantDepStoch = 30)))
# a schwa based on the length of vocal tract = 15.5 cm
image(t(getSpectralEnvelope(nr = 512, nc = 50, formants = NA,
  temperature = .1, vocalTract = 15.5)))

# no formants at all
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  formants = NA, temperature = 0)))

# manual specification of formants
image(t(getSpectralEnvelope(nr = 512, nc = 50,
  samplingRate = 16000, formants = list(
    'f1' = data.frame('time' = 0, 'freq' = 900, 'amp' = 30, 'width' = 120),
    'f2' = data.frame('time' = 0, 'freq' = 1300, 'amp' = 30, 'width' = 120),
    'f3' = data.frame('time' = 0, 'freq' = 3200, 'amp' = 20, 'width' = 200)))))

tatters/soundgen_beta documentation built on May 14, 2019, 9 a.m.