generateHarmonics: Generate harmonics
In soundgen: Sound Synthesis and Acoustic Analysis

Description

Internal soundgen function.

Usage

generateHarmonics(
  pitch,
  glottis = 0,
  attackLen = 50,
  nonlinBalance = 0,
  nonlinDep = 50,
  nonlinRandomWalk = NULL,
  jitterDep = 0,
  jitterLen = 1,
  vibratoFreq = 5,
  vibratoDep = 0,
  shimmerDep = 0,
  shimmerLen = 1,
  rolloff = -9,
  rolloffOct = 0,
  rolloffKHz = 0,
  rolloffParab = 0,
  rolloffParabHarm = 3,
  rolloff_perAmpl = 0,
  rolloffExact = NULL,
  formantLocking = NULL,
  specEnv = NULL,
  formantSummary = NULL,
  temperature = 0.025,
  pitchDriftDep = 0.5,
  pitchDriftFreq = 0.125,
  amplDriftDep = 1,
  subDriftDep = 4,
  rolloffDriftDep = 3,
  randomWalk_trendStrength = 0.1,
  shortestEpoch = 300,
  subRatio = 1,
  subFreq = 100,
  subDep = 0,
  subWidth = 10000,
  ampl = NA,
  normalize = TRUE,
  smoothing = list(),
  overlap = 75,
  samplingRate = 16000,
  pitchFloor = 75,
  pitchCeiling = 3500,
  pitchSamplingRate = 3500,
  dynamicRange = 80
)

Arguments

`pitch`	a contour of fundamental frequency (numeric vector). NB: for computational efficiency, provide the pitch contour at a reduced sampling rate `pitchSamplingRate`, e.g. 3500 points/s. The pitch contour will be upsampled before synthesis.
`glottis`	anchors for specifying the proportion of a glottal cycle with closed glottis, % (0 = no modification, 100 = closed phase as long as open phase); numeric vector or dataframe specifying time and value (anchor format)
`attackLen`	duration of fade-in / fade-out at each end of syllables and noise (ms): a vector of length 1 (symmetric) or 2 (separately for fade-in and fade-out)
`nonlinBalance`	hyperparameter for regulating the (approximate) proportion of sound with different regimes of pitch effects (none / subharmonics only / subharmonics and jitter). 0% = no noise; 100% = the entire sound has jitter + subharmonics. Ignored if temperature = 0
`nonlinRandomWalk`	a numeric vector specifying the timing of nonlinear regimes: 0 = none, 1 = subharmonics, 2 = subharmonics + jitter + shimmer
`jitterDep`	cycle-to-cycle random pitch variation, semitones (anchor format)
`jitterLen`	duration of stable periods between pitch jumps, ms. Use a low value for harsh noise, a high value for irregular vibrato or shaky voice (anchor format)
`vibratoFreq`	the rate of regular pitch modulation, or vibrato, Hz (anchor format)
`vibratoDep`	the depth of vibrato, semitones (anchor format)
`shimmerDep`	random variation in amplitude between individual glottal cycles (0 to 100% of original amplitude of each cycle) (anchor format)
`shimmerLen`	duration of stable periods between amplitude jumps, ms. Use a low value for harsh noise, a high value for shaky voice (anchor format)
`rolloff`	basic rolloff from lower to upper harmonics, dB/octave (exponential decay). All rolloff parameters are in anchor format. See `getRolloff` for more details
`rolloffOct`	basic rolloff changes from lower to upper harmonics (regardless of f0) by `rolloffOct` dB/oct. For example, we can get steeper rolloff in the upper part of the spectrum
`rolloffKHz`	rolloff changes linearly with f0 by `rolloffKHz` dB/kHz. For example, -6 dB/kHz gives a 6 dB steeper basic rolloff as f0 goes up by 1000 Hz
`rolloffParab`	an optional quadratic term affecting only the first `rolloffParabHarm` harmonics. The middle harmonic of the first `rolloffParabHarm` harmonics is amplified or dampened by `rolloffParab` dB relative to the basic exponential decay
`rolloffParabHarm`	the number of harmonics affected by `rolloffParab`
`rolloff_perAmpl`	as amplitude goes down from max to `-dynamicRange`, `rolloff` increases by `rolloff_perAmpl` dB/octave. The effect is to make loud parts brighter by increasing energy in higher frequencies
`rolloffExact`	user-specified exact strength of harmonics: a vector or matrix with one row per harmonic, scale 0 to 1 (overrides all other rolloff parameters)
`formantLocking`	the approximate proportion of sound in which one of the harmonics is locked to the nearest formant, 0 = none, 1 = the entire sound (anchor format)
`specEnv`	a matrix representing the filter (only needed for formant locking)
`temperature`	hyperparameter for regulating the amount of stochasticity in sound generation
`pitchDriftDep`	scale factor regulating the effect of temperature on the amount of slow random drift of f0 (like jitter, but slower): the higher, the more f0 "wiggles" at a given temperature
`pitchDriftFreq`	scale factor regulating the effect of temperature on the frequency of random drift of f0 (like jitter, but slower): the higher, the faster f0 "wiggles" at a given temperature
`amplDriftDep`	drift of amplitude mirroring pitch drift
`subDriftDep`	drift of subharmonic frequency and bandwidth mirroring pitch drift
`rolloffDriftDep`	drift of rolloff mirroring pitch drift
`randomWalk_trendStrength`	try values from 0 to 1: the higher, the more likely the random walk is to be high in the middle and low at the beginning and end (i.e. maximum effect amplitude in the middle of a sound)
`shortestEpoch`	minimum duration of each epoch with unchanging subharmonics regime or formant locking, in ms
`subRatio`	a positive integer giving the ratio of f0 (the main fundamental) to g0 (a lower frequency): 1 = no subharmonics, 2 = period doubling regardless of pitch changes, 3 = period tripling, etc.; `subRatio` overrides `subFreq` (anchor format)
`subFreq`	instead of a specific number of subharmonics (subRatio), we can specify the approximate g0 frequency (Hz), which is used only if subRatio = 1 and is adjusted to f0 so f0/g0 is always an integer (anchor format)
`subDep`	the depth of subharmonics relative to the main frequency component (f0), %. 0: no subharmonics; 100: g0 harmonics are as strong as the nearest f0 harmonic (anchor format)
`subWidth`	width of subharmonic sidebands: regulates how rapidly g-harmonics weaken away from f-harmonics. Large values like the default 10000 mean that all g0 harmonics are equally strong (anchor format)
`ampl`	amplitude envelope (dB, 0 = max amplitude) (anchor format)
`normalize`	if TRUE, normalizes to -1...+1 prior to applying attack and amplitude envelope. Without this, sounds with stronger harmonics are louder
`smoothing`	a list of parameters passed to `getSmoothContour` to control the interpolation and smoothing of contours: interpol (approx / spline / loess), loessSpan, discontThres, jumpThres
`overlap`	FFT window overlap, %. For allowed values, see `istft`
`samplingRate`	sampling frequency, Hz
`pitchFloor`, `pitchCeiling`	lower and upper bounds of f0, Hz
`pitchSamplingRate`	sampling frequency of the pitch contour only, Hz. Low values reduce processing time. Set to `pitchCeiling` for optimal speed or to `samplingRate` for optimal quality
`dynamicRange`	dynamic range, dB. Harmonics and noise more than dynamicRange under maximum amplitude are discarded to save computational resources
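
As a rough illustration of how `rolloff` and `dynamicRange` relate, the base-R sketch below converts a constant rolloff in dB/octave into linear amplitudes of harmonics and drops those that fall below the dynamic range. The helper `harmonicAmpl` is hypothetical and not part of soundgen; it assumes that harmonic h is attenuated by `rolloff * log2(h)` dB and ignores `rolloffOct`, `rolloffKHz`, `rolloffParab` and `rolloff_perAmpl`, so the actual calculation in `getRolloff` may differ.

```r
# Hypothetical helper (not part of soundgen): relative strength of the first
# nHarm harmonics under a constant rolloff given in dB/octave.
harmonicAmpl = function(nHarm, rolloff = -9, dynamicRange = 80) {
  h = 1:nHarm
  # assumption: harmonic h lies log2(h) octaves above f0
  dB = rolloff * log2(h)
  # discard harmonics more than dynamicRange dB below the strongest one
  keep = dB >= max(dB) - dynamicRange
  data.frame(harmonic = h[keep],
             dB = dB[keep],
             ampl = 10 ^ (dB[keep] / 20))  # dB to linear amplitude
}

# a narrow dynamic range of 20 dB keeps only the strongest harmonics
a = harmonicAmpl(nHarm = 8, rolloff = -9, dynamicRange = 20)
print(a)
```

A steeper (more negative) `rolloff` or a smaller `dynamicRange` leaves fewer harmonics to synthesize, which is why discarding weak harmonics saves computation.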

Details

Returns one continuous, unfiltered, voiced syllable consisting of several sine waves.
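
To make "several sine waves" concrete, here is a minimal base-R sketch of additive synthesis under simplifying assumptions: the pitch contour is linearly upsampled from `pitchSamplingRate` to `samplingRate`, the phase of f0 is obtained by cumulative summation, and harmonics are summed with fixed relative strengths. This is an illustration only, not soundgen's actual implementation, which also handles jitter, shimmer, vibrato, subharmonics, glottal pauses and amplitude envelopes.

```r
# Simplified additive synthesis (illustration only, not soundgen's own code)
samplingRate = 16000
pitchSamplingRate = 3500
pitch = seq(400, 530, length.out = 1500)  # f0 in Hz, sampled at pitchSamplingRate

# upsample the pitch contour to the audio sampling rate (linear interpolation)
dur = length(pitch) / pitchSamplingRate   # duration in s
n = round(dur * samplingRate)             # number of audio samples
f0 = approx(x = seq_along(pitch), y = pitch, n = n)$y

# integrate instantaneous frequency to obtain the phase of f0
phase = 2 * pi * cumsum(f0) / samplingRate

# sum the harmonics, each weighted by its relative strength (scale 0 to 1)
ampl = c(.2, .2, 1, .2, .2)
s = rowSums(sapply(seq_along(ampl), function(h) ampl[h] * sin(h * phase)))

# normalize to -1...+1
s = s / max(abs(s))
```

The highest partial here is the fifth harmonic of 530 Hz (2650 Hz), well below the Nyquist frequency of 8000 Hz, so no aliasing is expected in this sketch.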

Examples

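library(soundgen)  # provides spectrogram() and playme()

# rolloffExact as a vector: relative strength of the first five harmonics
# (scale 0 to 1), with the third harmonic the strongest
rolloffExact1 = c(.2, .2, 1, .2, .2)
s1 = soundgen:::generateHarmonics(pitch = seq(400, 530, length.out = 1500),
                       rolloffExact = rolloffExact1)
spectrogram(s1, 16000, ylim = c(0, 4))
# playme(s1, 16000)

# rolloffExact as a matrix with one row per harmonic:
# matrix() fills by column, giving 5 harmonics (rows) x 2 columns
rolloffExact2 = matrix(c(.2, .2, 1, .2, .2,
                         1, .5, .2, .1, .05), ncol = 2)
s2 = soundgen:::generateHarmonics(pitch = seq(400, 530, length.out = 1500),
                       rolloffExact = rolloffExact2)
spectrogram(s2, 16000, ylim = c(0, 4))
# playme(s2, 16000)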

soundgen documentation built on April 4, 2025, 3:44 a.m.