View source: R/modulationSpectrum.R
modulationSpectrum  R Documentation 
Produces a modulation spectrum of waveform(s) or audio file(s), with temporal
modulation along the X axis (Hz) and spectral modulation (1/KHz) along the Y
axis. A good visual analogy is decomposing the spectrogram into a sum of
ripples of various frequencies and directions. Roughness is calculated as the
proportion of energy / amplitude of the modulation spectrum within roughRange
of temporal modulation frequencies. The frequency of amplitude modulation
(amMsFreq, Hz) is calculated as the highest peak in the smoothed AM function,
and its purity (amMsPurity, dB) as the ratio of this peak to the median AM
over amRange. For relatively short and steady sounds, set amRes = NULL and
analyze the entire sound. For longer sounds and when roughness or AM vary over
time, set amRes to get multiple measurements over time (see examples).
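The roughness definition above can be illustrated on a toy matrix. This is only a sketch in base R, not the package's code: the helper rough_share() and all values are hypothetical, and it assumes that both negative and positive temporal modulation frequencies count towards roughRange.

```r
# Toy modulation spectrum: rows = spectral modulation, columns = temporal
# modulation frequencies (Hz). All values are made up for illustration.
tm_freq <- seq(-200, 200, by = 10)
ms_toy <- matrix(1, nrow = 5, ncol = length(tm_freq))
colnames(ms_toy) <- tm_freq

# Hypothetical helper (not part of soundgen): percentage of the sum total
# that falls within roughRange. Assumption: the range is applied to the
# absolute value of temporal modulation frequency.
rough_share <- function(ms, freqs, roughRange = c(30, 150)) {
  in_range <- abs(freqs) >= roughRange[1] & abs(freqs) <= roughRange[2]
  sum(ms[, in_range]) / sum(ms) * 100
}

r <- rough_share(ms_toy, tm_freq)
```

With a flat spectrum, the result is simply the share of columns inside the range, so widening roughRange can only increase it.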
modulationSpectrum(
  x,
  samplingRate = NULL,
  scale = NULL,
  from = NULL,
  to = NULL,
  amRes = 5,
  maxDur = 5,
  logSpec = FALSE,
  windowLength = 15,
  step = NULL,
  overlap = 80,
  wn = "hanning",
  zp = 0,
  power = 1,
  roughRange = c(30, 150),
  amRange = c(10, 200),
  returnMS = TRUE,
  returnComplex = FALSE,
  summaryFun = c("mean", "median", "sd"),
  averageMS = FALSE,
  reportEvery = NULL,
  cores = 1,
  plot = TRUE,
  savePlots = NULL,
  logWarp = NA,
  quantiles = c(0.5, 0.8, 0.9),
  kernelSize = 5,
  kernelSD = 0.5,
  colorTheme = c("bw", "seewave", "heat.colors", "...")[1],
  main = NULL,
  xlab = "Hz",
  ylab = "1/KHz",
  xlim = NULL,
  ylim = NULL,
  width = 900,
  height = 500,
  units = "px",
  res = NA,
  ...
)
x 
path to a folder, one or more wav or mp3 files c('file1.wav', 'file2.mp3'), Wave object, numeric vector, or a list of Wave objects or numeric vectors 
samplingRate 
sampling rate of 
scale 
maximum possible amplitude of input used for normalization of
input vector (only needed if 
from, to 
if NULL (default), analyzes the whole sound, otherwise from...to (s) 
amRes 
target resolution of amplitude modulation, Hz. If 
maxDur 
sounds longer than 
logSpec 
if TRUE, the spectrogram is log-transformed prior to taking 2D FFT 
windowLength 
length of FFT window, ms 
step 
you can override 
overlap 
overlap between successive FFT frames, % 
wn 
window type accepted by 
zp 
window length after zero padding, points 
power 
raise modulation spectrum to this power (e.g., power = 2 for ^2, or "power spectrum") 
roughRange 
the range of temporal modulation frequencies that constitute the "roughness" zone, Hz 
amRange 
the range of temporal modulation frequencies that we are interested in as "amplitude modulation" (AM), Hz 
returnMS 
if FALSE, only roughness is returned (much faster) 
returnComplex 
if TRUE, returns a complex modulation spectrum (without normalization and warping) 
summaryFun 
functions used to summarize each acoustic characteristic, e.g., c('mean', 'sd'); user-defined functions are fine (see examples); NAs are omitted automatically for mean/median/sd/min/max/range/sum, otherwise take care of NAs yourself 
averageMS 
if TRUE, the modulation spectra of all inputs are averaged into a single output; if FALSE, a separate MS is returned for each input 
reportEvery 
when processing multiple inputs, report estimated time left every ... iterations (NULL = default, NA = don't report) 
cores 
number of cores for parallel processing 
plot 
if TRUE, plots the modulation spectrum of each sound 
savePlots 
if a valid path is specified, a plot is saved in this folder (defaults to NULL) 
logWarp 
the base of log for warping the modulation spectrum (i.e., log2 if logWarp = 2); set to NULL or NA if you don't want to log-warp 
quantiles 
labeled contour values, % (e.g., "50" marks regions that contain 50% of the sum total of the entire modulation spectrum) 
kernelSize 
the size of Gaussian kernel used for smoothing (1 = no smoothing) 
kernelSD 
the SD of Gaussian kernel used for smoothing, relative to its size 
colorTheme 
black and white ('bw'), as in seewave package ('seewave'),
or any palette from 
xlab, ylab, main, xlim, ylim 
graphical parameters 
width, height, units, res 
parameters passed to

... 
other graphical parameters passed on to 
Algorithm: prepare a spectrogram, take its logarithm (if logSpec = TRUE),
center, perform a 2D Fourier transform (see also spectral::spec.fft()), take
the upper half of the resulting symmetric matrix, and raise it to power. The
result is returned as $original. For plotting purposes, the modulation matrix
can be smoothed with Gaussian blur (see gaussianSmooth2D) and log-warped (if
logWarp is a positive number). This processed modulation spectrum is returned
as $processed. If the audio is long enough, multiple windows are analyzed,
resulting in a vector of roughness values. For multiple inputs, such as a list
of waveforms or path to a folder with audio files, the ensemble of modulation
spectra can be interpolated to the same spectral and temporal resolution and
averaged (if averageMS = TRUE).
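The steps listed above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the test signal, the window settings, the small offset added before taking the log, and the choice of which half of the matrix to keep are all illustrative, and smoothing, log-warping and normalization are omitted.

```r
# Synthetic input: 1 s of a 500 Hz tone with 50 Hz amplitude modulation
samplingRate <- 16000
t_s <- (0:(samplingRate - 1)) / samplingRate
x <- sin(2 * pi * 500 * t_s) * (1 + 0.8 * sin(2 * pi * 50 * t_s))

# 1. Spectrogram: 15 ms Hann windows with a 3 ms step (80% overlap)
win_len <- 240
step <- 48
hann <- 0.5 - 0.5 * cos(2 * pi * (0:(win_len - 1)) / (win_len - 1))
starts <- seq(1, length(x) - win_len + 1, by = step)
spec <- sapply(starts, function(i) {
  Mod(fft(x[i:(i + win_len - 1)] * hann))[1:(win_len / 2)]
})  # rows = frequency bins, columns = time frames

# 2. Optional log-transform (the offset is an assumption), then centering
logSpec <- FALSE
power <- 1
if (logSpec) spec <- log(spec + 1e-6)
spec_centered <- spec - mean(spec)

# 3. 2D Fourier transform: fft() on a matrix is multidimensional in base R
ms_full <- Mod(fft(spec_centered))

# 4. Keep one half of the symmetric matrix and raise it to the given power
ms_half <- ms_full[1:(nrow(ms_full) / 2), ]^power

# Temporal modulation axis for the positive-frequency columns
n_frames <- ncol(ms_half)
frame_rate <- samplingRate / step
k <- 1:floor(n_frames / 2)
am_profile <- colSums(ms_half)[k + 1]   # column k + 1 holds frequency k
tm_freq <- k * frame_rate / n_frames
peak_hz <- tm_freq[which.max(am_profile)]
```

In this sketch the strongest temporal modulation should fall close to the 50 Hz modulation rate of the input, which is a quick sanity check that the axes are labeled correctly.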
Returns a list with the following components:
$original
modulation spectrum prior to blurring and log-warping, but after raising to power (e.g., squaring if power = 2), a matrix of nonnegative values. Rownames are spectral modulation frequencies (cycles/KHz), and colnames are temporal modulation frequencies (Hz).
$processed
modulation spectrum after blurring and log-warping
$complex
untransformed complex modulation spectrum (returned only if returnComplex = TRUE)
$roughness
proportion of energy / amplitude of the modulation spectrum within roughRange of temporal modulation frequencies, %: a vector if amRes is numeric and the sound is long enough, a single number otherwise
$amMsFreq
frequency of the highest peak, within amRange, of the folded AM function (average AM across all FM bins for both negative and positive AM frequencies), where a peak is a local maximum over amRes Hz. Like roughness, amMsFreq and amMsPurity can be single numbers or vectors, depending on whether the sound is analyzed as a whole or in chunks
$amMsPurity
ratio of the peak at amMsFreq to the median AM over amRange, dB
$summary
dataframe with summaries of roughness, amMsFreq, and amMsPurity
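To make the definitions of amMsFreq and amMsPurity concrete, here is a minimal base-R sketch on a made-up AM function. It is not the package's code: the variable names are hypothetical, the peak is taken as the global maximum within amRange rather than a local maximum over amRes Hz, and the 20 * log10 convention for dB is an assumption chosen to match the 10^(ms$amMsPurity/20) conversion used in the examples.

```r
# Made-up folded AM function: flat baseline with one peak at 60 Hz
am_freq <- seq(0, 250, by = 5)        # temporal modulation frequencies, Hz
am <- rep(1, length(am_freq))
am[am_freq == 60] <- 10

# Restrict the search to amRange
amRange <- c(10, 200)
idx <- which(am_freq >= amRange[1] & am_freq <= amRange[2])

# Simplification: highest value within amRange
peak_i <- idx[which.max(am[idx])]
amMsFreq_toy <- am_freq[peak_i]

# Ratio of the peak to the median AM over amRange, in dB
# (assumes an amplitude-like quantity, hence 20 * log10)
amMsPurity_toy <- 20 * log10(am[peak_i] / median(am[idx]))
```

Using the median rather than the mean as the reference keeps a single strong peak from inflating its own baseline.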
Singh, N. C., & Theunissen, F. E. (2003). Modulation spectra of natural sounds and ethological theories of auditory processing. The Journal of the Acoustical Society of America, 114(6), 3394-3411.
spectrogram
analyze
# White noise
ms = modulationSpectrum(runif(16000), samplingRate = 16000,
  logSpec = FALSE, power = 1,
  amRes = NULL)  # analyze the entire sound, giving a single roughness value
str(ms)

# Harmonic sound
s = soundgen(amFreq = 25, amDep = 50)
ms = modulationSpectrum(s, samplingRate = 16000, amRes = NULL)
ms[c('roughness', 'amMsFreq', 'amMsPurity')]  # a single value for each
ms1 = modulationSpectrum(s, samplingRate = 16000, amRes = 5)
ms1[c('roughness', 'amMsFreq', 'amMsPurity')]
# measured over time (low values of amRes mean more precision, so we analyze
# longer segments and get fewer values per sound)

# Embellish
ms = modulationSpectrum(s, samplingRate = 16000,
  xlab = 'Temporal modulation, Hz', ylab = 'Spectral modulation, 1/KHz',
  colorTheme = 'heat.colors', main = 'Modulation spectrum', lty = 3)

## Not run: 
# A long sound with varying AM and a bit of chaos at the end
s_long = soundgen(sylLen = 1500, pitch = c(250, 320, 280),
  amFreq = c(30, 55), amDep = c(20, 60, 40), jitterDep = c(0, 0, 2))
playme(s_long)
ms = modulationSpectrum(s_long, 16000)
# plot AM over time
plot(x = seq(1, 1500, length.out = length(ms$amMsFreq)), y = ms$amMsFreq,
  cex = 10^(ms$amMsPurity/20) * 10, xlab = 'Time, ms', ylab = 'AM frequency, Hz')
# plot roughness over time
spectrogram(s_long, 16000, ylim = c(0, 4),
  extraContour = list(ms$roughness / max(ms$roughness) * 4000, col = 'blue'))

# As with spectrograms, there is a tradeoff in time-frequency resolution
s = soundgen(pitch = 500, amFreq = 50, amDep = 100,
  samplingRate = 44100)
# playme(s, samplingRate = 44100)
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 50, step = 50, amRes = NULL)  # poor temporal resolution
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 5, step = 1, amRes = NULL)  # poor frequency resolution
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 15, step = 3, amRes = NULL)  # a reasonable compromise

# customize the plot
ms = modulationSpectrum(s, samplingRate = 44100,
  windowLength = 15, overlap = 80, amRes = NULL,
  kernelSize = 17,  # more smoothing
  xlim = c(-70, 70), ylim = c(0, 4),  # zoom in on the central region
  quantiles = c(.25, .5, .8),  # customize contour lines
  colorTheme = 'heat.colors',  # alternative palette
  power = 2)  # ^2
# Note the peaks at FM = 2/KHz (from "pitch = 500") and AM = 50 Hz (from
# "amFreq = 50")

# Input can be a wav/mp3 file
ms = modulationSpectrum('~/Downloads/temp/200_ut_fearbungee_11.wav')

# Input can be path to folder with audio files. Each file is processed
# separately, and the output can contain an MS per file...
ms1 = modulationSpectrum('~/Downloads/temp', kernelSize = 11,
  plot = FALSE, averageMS = FALSE)
ms1$summary
names(ms1$original)  # a separate MS per file
# ...or a single MS can be calculated:
ms2 = modulationSpectrum('~/Downloads/temp', kernelSize = 11,
  plot = FALSE, averageMS = TRUE)
image(t(ms2$original))
ms2$summary

# Input can also be a list of waveforms (numeric vectors)
ss = vector('list', 10)
for (i in seq_along(ss)) {
  ss[[i]] = soundgen(sylLen = runif(1, 100, 1000), temperature = .4,
    pitch = runif(3, 400, 600))
}
# lapply(ss, playme)
# MS of the first sound
ms1 = modulationSpectrum(ss[[1]], samplingRate = 16000, scale = 1)
# average MS of all 10 sounds
ms2 = modulationSpectrum(ss, samplingRate = 16000, scale = 1,
  averageMS = TRUE)

# A sound with ~3 syllables per second and only downsweeps in F0 contour
s = soundgen(nSyl = 8, sylLen = 200, pauseLen = 100, pitch = c(300, 200))
# playme(s)
ms = modulationSpectrum(s, samplingRate = 16000, maxDur = .5,
  xlim = c(-25, 25), colorTheme = 'seewave',
  power = 2)  # note the asymmetry b/c of downsweeps

# "power = 2" returns squared modulation spectrum; note that this affects
# the roughness measure!
ms$roughness
# compare:
modulationSpectrum(s, samplingRate = 16000, maxDur = .5,
  xlim = c(-25, 25), colorTheme = 'seewave',
  logWarp = NULL, power = 1)$roughness  # much higher roughness

# Plotting with or without log-warping the modulation spectrum:
ms = modulationSpectrum(soundgen(), samplingRate = 16000,
  logWarp = NA, plot = TRUE)
ms = modulationSpectrum(soundgen(), samplingRate = 16000,
  logWarp = 2, plot = TRUE)

# logWarp and kernelSize have no effect on roughness
# because it is calculated before these transforms:
modulationSpectrum(s, samplingRate = 16000, logWarp = 5)$roughness
modulationSpectrum(s, samplingRate = 16000, logWarp = NA)$roughness
modulationSpectrum(s, samplingRate = 16000, kernelSize = 17)$roughness

# Log-transform the spectrogram prior to 2D FFT (affects roughness):
ms = modulationSpectrum(soundgen(), samplingRate = 16000, logSpec = FALSE)
ms = modulationSpectrum(soundgen(), samplingRate = 16000, logSpec = TRUE)

# Complex modulation spectrum with phase preserved
ms = modulationSpectrum(soundgen(), samplingRate = 16000,
  returnComplex = TRUE)
image(t(log(abs(ms$complex))))

## End(Not run)