estimateVTL: Estimate vocal tract length
In soundgen: Sound Synthesis and Acoustic Analysis

estimateVTL

R Documentation

Estimate vocal tract length

Description

Estimates vocal tract length (VTL) from formant frequencies, using one of three methods.

If method = 'meanFormant', VTL is calculated separately for each formant, and the resulting VTLs are then averaged. The equation used is (2 * formant_number - 1) * speedSound / (4 * formant_frequency) for a closed-open tube (mouth open) and formant_number * speedSound / (2 * formant_frequency) for an open-open or closed-closed tube (e.g., closed mouth in mmm, or open mouth and open glottis in whispering).

If method = 'meanDispersion', formant dispersion is calculated as the mean distance between adjacent formants, and VTL is then calculated as speedSound / 2 / formant dispersion.

If method = 'regression', formant dispersion is estimated using the regression method described in Reby et al. (2005) "Red deer stags use formants as assessment cues during intrasexual agonistic interactions".

For a review of these and other VTL-related summary measures of formant frequencies, refer to Pisanski et al. (2014) "Vocal indicators of body size in men and women: a meta-analysis". See also schwa for VTL estimation with additional information on formant frequencies.
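
The 'meanFormant' and 'meanDispersion' equations above can be sketched in a few lines of base R. This is only an illustration of the arithmetic: the helper names vtlMeanFormant and vtlMeanDispersion are hypothetical and are not part of soundgen, and the dispersion helper assumes a complete vector of adjacent formants with no gaps.

```r
# Illustrative base-R versions of the two closed-form estimates.
# The helper names are hypothetical; this is not soundgen's internal code.
vtlMeanFormant = function(freqs, tube = 'closed-open', speedSound = 35400) {
  n = seq_along(freqs)  # formant numbers: 1, 2, 3, ...
  vtl = if (tube == 'closed-open') {
    (2 * n - 1) * speedSound / (4 * freqs)
  } else {
    n * speedSound / (2 * freqs)
  }
  mean(vtl, na.rm = TRUE)  # average the per-formant estimates
}

vtlMeanDispersion = function(freqs, speedSound = 35400) {
  dispersion = mean(diff(freqs))  # mean distance between adjacent formants
  speedSound / 2 / dispersion
}

# Resonances of an idealized 17.7-cm closed-open tube
f = c(500, 1500, 2500)
vtlMeanFormant(f)     # 17.7
vtlMeanDispersion(f)  # 17.7
```

For evenly spaced resonances like these, both formulas recover the same length; they diverge once the measured formants depart from the ideal tube pattern.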

Usage

estimateVTL(
  formants,
  method = c("regression", "meanDispersion", "meanFormant")[1],
  interceptZero = TRUE,
  tube = c("closed-open", "open-open")[1],
  speedSound = 35400,
  checkFormat = TRUE,
  output = c("simple", "detailed")[1],
  plot = FALSE
)

Arguments

`formants`	formant frequencies in any format recognized by `soundgen`: a vector of formant frequencies like `c(550, 1600, 3200)`; a list with multiple values per formant like `list(f1 = c(500, 550), f2 = 1200)`; or a character string like `aaui` referring to default presets for speaker "M1" in soundgen presets
`method`	the method of estimating vocal tract length (see Description)
`interceptZero`	if TRUE, forces the regression curve to pass through the origin. This reduces the influence of highly variable lower formants, but we have to commit to a particular model of the vocal tract: closed-open or open-open/closed-closed (method = "regression" only)
`tube`	the vocal tract is assumed to be a cylindrical tube that is either "closed-open" or "open-open" (same as closed-closed)
`speedSound`	speed of sound in warm air, by default 35400 cm/s. Stevens (2000) "Acoustic phonetics", p. 138
`checkFormat`	if FALSE, only a list of properly formatted formant frequencies is accepted
`output`	"simple" (default) = just the VTL; "detailed" = a list of additional stats (see Value below)
`plot`	if TRUE, plots the regression line whose slope gives formant dispersion (method = "regression" only). Label sizes show the influence of each formant, and the blue line corresponds to each formant being an integer multiple of F1 (as when harmonics are misidentified as formants); the second plot shows how VTL varies depending on the number of formants used
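
How `interceptZero` and `tube` interact in the regression method can be sketched as follows. This is a rough illustration that assumes the Reby et al. (2005) parameterization, in which the formants of a closed-open tube fall at odd multiples of half the formant dispersion; the function vtlRegressionSketch is hypothetical and does not reproduce soundgen's internal code (for instance, it applies no weighting of observations).

```r
# Hypothetical sketch of the regression idea; not soundgen's implementation
vtlRegressionSketch = function(freqs, tube = 'closed-open',
                               interceptZero = TRUE, speedSound = 35400) {
  n = seq_along(freqs)  # formant numbers: 1, 2, 3, ...
  # Expected formant positions in units of formant dispersion:
  # 0.5, 1.5, 2.5, ... for closed-open; 1, 2, 3, ... for open-open
  x = if (tube == 'closed-open') (2 * n - 1) / 2 else n
  # Dropping the intercept forces the line through the origin
  fit = if (interceptZero) lm(freqs ~ x - 1) else lm(freqs ~ x)
  dispersion = coef(fit)[['x']]  # the slope is the formant dispersion
  speedSound / (2 * dispersion)
}

vtlRegressionSketch(c(500, 1500, 2500))  # 17.7 for an ideal closed-open tube
vtlRegressionSketch(c(500, 1500, 2500), interceptZero = FALSE)
```

In this sketch, the choice of `tube` only shifts the predictor by a constant, so it affects the slope only when the line is forced through the origin. That is why `interceptZero = TRUE` requires committing to a tube model.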

Value

If output = 'simple' (default), returns the estimated vocal tract length in cm. If output = 'detailed' and method = 'regression', returns a list with extra stats used for plotting. Namely, $regressionInfo$infl gives the influence of each observation, calculated as the absolute change in VTL with vs. without the observation, multiplied by 10, plus 1 (the size of labels on the first plot). $vtlPerFormant$vtl gives the VTL as it would be estimated if only the first nFormants were used.

Examples

estimateVTL(NA)
estimateVTL(500)
estimateVTL(c(600, 1850, 2800, 3600, 5000), plot = TRUE)
estimateVTL(c(600, 1850, 2800, 3600, 5000), plot = TRUE, output = 'detailed')
estimateVTL(c(1200, 2000, 2800, 3800, 5400, 6400),
  tube = 'open-open', interceptZero = FALSE, plot = TRUE)
estimateVTL(c(1200, 2000, 2800, 3800, 5400, 6400),
  tube = 'open-open', interceptZero = TRUE, plot = TRUE)

# Multiple measurements are OK
estimateVTL(
  formants = list(f1 = c(540, 600, 550),
                  f2 = 1650,
                  f3 = c(2400, 2550)),
  plot = TRUE, output = 'detailed')
# NB: this is better than averaging formant values. Cf.:
estimateVTL(
  formants = list(f1 = mean(c(540, 600, 550)),
                  f2 = 1650,
                  f3 = mean(c(2400, 2550))),
  plot = TRUE)

# Missing values are OK
estimateVTL(c(600, 1850, 3100, NA, 5000), plot = TRUE)
estimateVTL(list(f1 = 500, f2 = c(1650, NA, 1400), f3 = 2700), plot = TRUE)

# Note that VTL estimates based on the commonly reported 'meanDispersion'
# depend only on the first and last formants
estimateVTL(c(500, 1400, 2800, 4100), method = 'meanDispersion')
estimateVTL(c(500, 1100, 2300, 4100), method = 'meanDispersion') # identical
# ...but this is not the case for 'meanFormant' and 'regression' methods
estimateVTL(c(500, 1400, 2800, 4100), method = 'meanFormant')
estimateVTL(c(500, 1100, 2300, 4100), method = 'meanFormant') # much longer
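
# Why 'meanDispersion' ignores the middle formants: a base-R sketch of the
# arithmetic (not soundgen's internal code). The mean of successive
# differences telescopes to (last - first) / (number of formants - 1)
fA = c(500, 1400, 2800, 4100)
fB = c(500, 1100, 2300, 4100)
mean(diff(fA))  # (4100 - 500) / 3 = 1200
mean(diff(fB))  # also 1200: the middle formants cancel out
# Assuming VTL = speedSound / 2 / dispersion with the default speedSound
35400 / 2 / mean(diff(fA))  # 14.75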

## Not run: 
# Compare the results produced by the three methods
nIter = 1000
out = data.frame(meanFormant = rep(NA, nIter), meanDispersion = NA, regression = NA)
for (i in 1:nIter) {
  # generate a random formant configuration
  f = runif(1, 300, 900) + (1:6) * rnorm(6, 1000, 200)
  out$meanFormant[i]    = estimateVTL(f, method = 'meanFormant')
  out$meanDispersion[i] = estimateVTL(f, method = 'meanDispersion')
  out$regression[i]     = estimateVTL(f, method = 'regression')
}
pairs(out)
cor(out)
# 'meanDispersion' is pretty different, while 'meanFormant' and 'regression'
# give broadly comparable results

## End(Not run)

soundgen documentation built on April 4, 2025, 3:44 a.m.