Acoustic analysis of a single sound file: pitch tracking and basic spectral characteristics. The default values of arguments are optimized for human non-linguistic vocalizations. See the vignette on acoustic analysis for details.
analyze(x, samplingRate = NULL, silence = 0.04, windowLength = 50,
  step = NULL, overlap = 50, wn = "gaussian", zp = 0, cutFreq = 6000,
  nFormants = 3, pitchMethods = c("autocor", "spec", "dom"),
  entropyThres = 0.6, pitchFloor = 75, pitchCeiling = 3500,
  priorMean = HzToSemitones(300), priorSD = 6, priorPlot = FALSE,
  nCands = 1, minVoicedCands = "autom", domThres = 0.1, domSmooth = 220,
  autocorThres = 0.7, autocorSmooth = NULL, cepThres = 0.3,
  cepSmooth = NULL, cepZp = 0, specThres = 0.3, specPeak = 0.35,
  specSinglePeakCert = 0.4, specHNRslope = 0.8, specSmooth = 150,
  specMerge = 1, shortestSyl = 20, shortestPause = 60, interpolWin = 3,
  interpolTol = 0.3, interpolCert = 0.3,
  pathfinding = c("none", "fast", "slow")[2],
  annealPars = list(maxit = 5000, temp = 1000),
  certWeight = 0.5, snakeStep = 0.05, snakePlot = FALSE, smooth = 1,
  smoothVars = c("pitch", "dom"), summary = FALSE, plot = TRUE,
  savePath = NA,
  specPlot = list(contrast = 0.2, brightness = 0, ylim = c(0, 5)),
  pitchPlot = list(col = rgb(0, 0, 1, 0.75), lwd = 3),
  candPlot = list(levels = c("autocor", "spec", "dom", "cep"),
    col = c("green", "red", "orange", "violet"), pch = c(16, 2, 3, 7),
    cex = 2))
x | path to a .wav file or a vector of amplitudes with specified samplingRate
samplingRate | sampling rate of x
silence | (0 to 1) frames with mean absolute amplitude below the silence threshold are not analyzed at all. NB: this number is dynamically updated: the actual silence threshold may be higher depending on the quietest frame, but it will never be lower than this specified number.
windowLength | length of FFT window, ms
step | you can override
overlap | overlap between successive FFT frames, %
wn | window type: gaussian, hanning, hamming, bartlett, rectangular, blackman, flattop
zp | window length after zero padding, points
cutFreq | (>0 to Nyquist, Hz) repeat the calculation of spectral descriptives after discarding all info above cutFreq
nFormants | the number of formants to extract per FFT frame. Calls
pitchMethods | methods of pitch estimation to consider for determining pitch contour: 'autocor' = autocorrelation (~PRAAT), 'cep' = cepstral, 'spec' = spectral (~BaNa), 'dom' = lowest dominant frequency band
entropyThres | pitch tracking is not performed for frames with Wiener entropy above entropyThres
pitchFloor, pitchCeiling | absolute bounds for pitch candidates (Hz)
priorMean, priorSD | specifies the mean and SD of the gamma distribution describing our prior knowledge about the most likely pitch values for this file. Specified in semitones:
priorPlot | if TRUE, produces a separate plot of the prior
nCands | maximum number of pitch candidates per method (except for
minVoicedCands | minimum number of pitch candidates that have to be defined to consider a frame voiced (defaults to 2 if
domThres | (0 to 1) to find the lowest dominant frequency band, we do short-term FFT and take the lowest frequency with amplitude at least domThres
domSmooth | the width of smoothing interval (Hz) for finding
autocorThres, cepThres, specThres | (0 to 1) separate voicing thresholds for detecting pitch candidates with three different methods: autocorrelation, cepstrum, and BaNa algorithm (see Details). Note that HNR is calculated even for unvoiced frames.
autocorSmooth | the width of smoothing interval (in bins) for finding peaks in the autocorrelation function. Defaults to 7 for sampling rate 44100 and smaller odd numbers for lower values of sampling rate
cepSmooth | the width of smoothing interval (in bins) for finding peaks in the cepstrum. Defaults to 31 for sampling rate 44100 and smaller odd numbers for lower values of sampling rate
cepZp | zero-padding of the spectrum used for cepstral pitch detection (final length of spectrum after zero-padding in points, e.g. 2 ^ 13)
specPeak, specHNRslope | when looking for putative harmonics in the spectrum, the threshold for peak detection is calculated as
specSinglePeakCert | (0 to 1) if F0 is calculated based on a single harmonic ratio (as opposed to several ratios converging on the same candidate), its certainty is taken to be specSinglePeakCert
specSmooth | the width of window for detecting peaks in the spectrum, Hz
specMerge | pitch candidates within
shortestSyl | the smallest length of a voiced segment (ms) that constitutes a voiced syllable (shorter segments will be replaced by NA, as if unvoiced)
shortestPause | the smallest gap between voiced syllables (ms) that means they shouldn't be merged into one voiced syllable
interpolWin, interpolTol, interpolCert | control the behavior of the interpolation algorithm when postprocessing pitch candidates. To turn off interpolation, set
pathfinding | method of finding the optimal path through pitch candidates: 'none' = best candidate per frame, 'fast' = simple heuristic, 'slow' = annealing. See
annealPars | a list of control parameters for postprocessing of the pitch contour with the SANN algorithm of optim
certWeight | (0 to 1) in pitch postprocessing, specifies how much we prioritize the certainty of pitch candidates vs. pitch jumps / the internal tension of the resulting pitch curve
snakeStep | the optimized path through pitch candidates is further processed to minimize the elastic force acting on the pitch contour. To disable, set
snakePlot | if TRUE, plots the snake
smooth, smoothVars | if
summary | if TRUE, returns only a summary of the measured acoustic variables (mean, median and SD). If FALSE, returns a list containing frame-by-frame values
plot | if TRUE, produces a spectrogram with pitch contour overlaid
savePath | if a valid path is specified, a plot is saved in this folder (defaults to NA)
specPlot | a list of graphical parameters passed to
pitchPlot | a list of graphical parameters for displaying the final pitch contour. Set to
candPlot | a list of graphical parameters for displaying individual pitch candidates. Set to
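The interplay of windowLength, overlap, step and silence can be illustrated with a minimal base-R sketch. It assumes that the default step is derived from the window length and overlap in the usual way, and that the signal is scaled to [-1, 1]. The variable names (sound, starts, meanAbs, analyzed) are illustrative only and are not part of the package, and the dynamic update of the silence threshold is not reproduced here.

```r
# Assumptions: mono signal scaled to [-1, 1]; all names are hypothetical.
samplingRate <- 16000
windowLength <- 50    # ms
overlap <- 50         # %
silence <- 0.04       # fixed threshold (no dynamic update in this sketch)

# Step between successive frames implied by window length and overlap
step <- windowLength * (1 - overlap / 100)   # ms

# Synthetic input: 0.5 s of near-silence followed by 0.5 s of a 300 Hz tone
n <- samplingRate / 2
secs <- (0:(n - 1)) / samplingRate
sound <- c(rep(0.001, n), 0.8 * sin(2 * pi * 300 * secs))

# Convert window and step from ms to points and find frame start indices
winPoints <- round(windowLength / 1000 * samplingRate)
stepPoints <- round(step / 1000 * samplingRate)
starts <- seq(1, length(sound) - winPoints + 1, by = stepPoints)

# Mean absolute amplitude per frame; frames below `silence` would be skipped
meanAbs <- sapply(starts, function(s) mean(abs(sound[s:(s + winPoints - 1)])))
analyzed <- meanAbs >= silence

table(analyzed)
```

With these settings the frames that fall entirely within the quiet first half are flagged as not analyzed, while any frame that contains part of the tone passes the threshold.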
If summary = TRUE, returns a dataframe with one row and three columns per acoustic variable (mean / median / SD). If summary = FALSE, returns a dataframe with one row per FFT frame and one column per acoustic variable. The best guess at the pitch contour considering all available information is stored in the variable called "pitch". In addition, the output contains a number of other acoustic descriptors and pitch estimates by separate algorithms included in pitchMethods. See the vignette on acoustic analysis for a full explanation of returned measures.
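To make the two output shapes concrete, here is a rough base-R sketch of how a frame-by-frame dataframe could be collapsed into a one-row summary with mean, median and SD per variable. The data, the helper summarize_frames, the column-naming scheme and the handling of NA (unvoiced) frames are all assumptions made for illustration; they are not taken from the package.

```r
# Hypothetical frame-by-frame output: one row per FFT frame,
# NA in `pitch` for frames treated as unvoiced
frames <- data.frame(
  pitch = c(NA, 300, 320, 310, NA),
  dom = c(250, 300, 330, 300, 280)
)

# Collapse to one row with three columns (mean / median / SD) per variable.
# Assumption: NAs are ignored when summarizing.
summarize_frames <- function(df) {
  stats <- lapply(names(df), function(v) {
    x <- df[[v]]
    out <- data.frame(
      mean(x, na.rm = TRUE),
      median(x, na.rm = TRUE),
      sd(x, na.rm = TRUE)
    )
    names(out) <- paste(v, c("mean", "median", "sd"), sep = "_")
    out
  })
  do.call(cbind, stats)
}

s <- summarize_frames(frames)
print(s)
```

The result has a single row and two times three columns, matching the shape described for summary = TRUE.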
## Not run:
sound1 = soundgen(sylLen = 900, pitchAnchors = list(
  time = c(0, .3, .8, 1), value = c(300, 900, 400, 2300)),
  noiseAnchors = list(time = c(0, 900), value = c(-40, 0)),
  temperature = 0)
# playme(sound1, 16000)
a1 = analyze(sound1, samplingRate = 16000, plot = TRUE)
# or, to improve the quality of postprocessing:
a1 = analyze(sound1, samplingRate = 16000, plot = TRUE, pathfinding = 'slow')
median(a1$pitch, na.rm = TRUE)  # 614 Hz
# (can vary, since postprocessing is stochastic)
# compare to the true value:
median(getSmoothContour(anchors = list(time = c(0, .3, .8, 1),
  value = c(300, 900, 400, 2300)), len = 1000))

# the same pitch contour, but harder b/c of subharmonics and jitter
sound2 = soundgen(sylLen = 900, temperature = 0,
  pitchAnchors = list(time = c(0, .3, .8, 1),
    value = c(300, 900, 400, 2300)),
  noiseAnchors = list(time = c(0, 900), value = c(-40, 20)),
  subDep = 100, jitterDep = 0.5, pitchEffectsAmount = 100)
# playme(sound2, 16000)
a2 = analyze(sound2, samplingRate = 16000, plot = TRUE, pathfinding = 'slow')
# many candidates are off, but the overall contour should be mostly accurate

# Fancy plotting options:
a = analyze(sound2, samplingRate = 16000, plot = TRUE,
  specPlot = list(xlab = 'Time, ms', colorTheme = 'seewave', contrast = .8),
  candPlot = list(cex = 3, col = c('gray70', 'yellow', 'purple', 'maroon')),
  pitchPlot = list(col = 'black', lty = 3, lwd = 3))

# Plot pitch candidates w/o a spectrogram
a = analyze(sound2, samplingRate = 16000, plot = TRUE, specPlot = NA)
## End(Not run)