RoughnessFFT (R Documentation)
This function estimates the roughness (or sensory dissonance) of a sound from its auditory nerve image, as calculated by CalcANI.
Roughness is considered to be a sensory process highly related to texture perception. The visualization and calculation method is based on Leman (2000), where roughness is defined as the energy of the relevant beating frequencies in the auditory channels. The model is based on phase-locking to frequencies that are present in the neural patterns. A synchronization index allows the visualization of the energy components underlying roughness, in particular concerning the auditory channels and the phase-locking synchronization, that is, the synchronization index for the relevant beating frequencies on a frequency scale.
Beating frequencies are those at which the sound oscillates in amplitude. Thus, for an amplitude-modulated sine wave with carrier frequency f_{c} and modulation frequency f_{m}, the spectrum contains only three frequencies: f_{c}, f_{c}-f_{m} and f_{c}+f_{m}; the beating frequency is f_{m}. In the auditory system, these frequencies are introduced as effective beating frequencies into the spectrum of the neural rate-code patterns. This is due to wave rectification in the cochlea, where the lower part of the modulated signal is cut off. As a result, new frequencies are introduced, of which the most important correspond to the beating frequency f_{m} and its multiples. Neurons may synchronize with these frequencies provided that they fall within the frequency range where synchronization is physiologically possible. This mechanism forms a physiological basis for the detection of beats and hence for the sensation of roughness.
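The effect of rectification can be illustrated with a few lines of plain R (a minimal sketch, not part of the package; the sampling rate, carrier and modulation frequencies below are arbitrary). Before half-wave rectification the amplitude-modulated sine has essentially no energy at f_{m}; after rectification it does:

fs <- 22050                                  # sampling rate (arbitrary)
t  <- seq(0, 1 - 1/fs, by = 1/fs)            # 1 s of signal
fc <- 1000; fm <- 70                         # carrier and beating frequency (arbitrary)
x  <- (1 + cos(2 * pi * fm * t)) * sin(2 * pi * fc * t)  # spectrum: fc, fc - fm, fc + fm
xr <- pmax(x, 0)                             # half-wave rectification (cochlear cut-off)
f  <- (seq_along(t) - 1) * fs / length(t)    # FFT frequency axis
bin <- which.min(abs(f - fm))                # FFT bin closest to fm
c(before = Mod(fft(x))[bin], after = Mod(fft(xr))[bin]) / length(t)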
The synchronization index model calculates roughness in terms of the energy of neural synchronization to the beating frequencies. The energy refers to a quantity which we derive from the magnitude spectrum. Since the beating frequencies are contained in the lower spectral area of a continuous neural auditory pattern a(t), the spectral part we are interested in is
b_{j}(t,\omega) = F_{j}(t,\omega)\,|d_{j}(t,\omega)|, \quad (j = 1, \ldots, m),
where F_{j}(t,ω) is a filter and d_{j}(t,ω) is the short-term spectrum
d_{j}(t,\omega) = \int_{-\infty}^{+\infty} a_{j}(t^{\prime})\, w(t^{\prime}-t)\, e^{-2\pi i \omega t^{\prime}}\, \mathrm{d}t^{\prime},
where w(t^{\prime}-t) is a (Hamming) window. The magnitude spectrum is then defined as |d_{j}(t, ω)| and the phase spectrum as \angle d_{j}(t, ω). In order to reproduce the psychoacoustical data on roughness, the filters F_j(t,ω) should be narrower at auditory channels whose center frequency is below 800 Hz, and they should be attenuated at high center frequencies as well.
The pattern b_{j}(t,ω) represents the spectrum of the neural synchronization to the beating frequencies in each channel. The synchronization index (Javel et al., 1988) of the beating frequencies is then defined as the normalized magnitude
\xi_{j}(t,\omega) = \left| \frac{b_{j}(t,\omega)}{d_{j}(t,0)} \right|,
where ξ_{j}(t, ω) is the normalized magnitude and d_{j}(t,0) is the DC-component of the whole signal at each channel. The short-term energy spectrum of the neural synchronization to beating frequencies is defined by
\hat{b}_{j}(t,\omega) = \xi_{j}(t,\omega)^{\alpha},
where α (1 < α < 2) is a parameter which can be related to the power law (Leman, 2000). In this function, α = 1.6 is used by default. We then define the roughness as
r_{\hat{b}}(t) = \int \sum_{j=1}^{m} \hat{b}_{j}(t,\omega)\, \mathrm{d}\omega,
which is obtained by integrating the energy over all frequencies as well as over all channels. This definition allows two visualizations: one along the axis of the auditory channels and one along the axis of the (beating) frequencies.
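The computation above can be sketched in a few lines of plain R for a single analysis frame. This is only an illustration of the formulas, not the implementation used by RoughnessFFT: the filtering F_j(t,ω) and the channel-dependent attenuation are omitted, and the names frame_roughness, ani and fs are made up for the example. Here ani is assumed to be a channels-by-samples matrix holding the neural patterns a_j(t) of one frame, sampled at fs Hz.

frame_roughness <- function(ani, fs, alpha = 1.6) {
  n <- ncol(ani)
  w <- 0.54 - 0.46 * cos(2 * pi * (seq_len(n) - 1) / (n - 1))  # Hamming window
  total <- 0
  for (j in seq_len(nrow(ani))) {
    d  <- fft(ani[j, ] * w)          # short-term spectrum d_j(t, omega)
    si <- Mod(d) / Mod(d[1])         # synchronization index: magnitude normalised by the DC component
    si[1] <- 0                       # drop the DC term itself
    total <- total + sum(si^alpha) * fs / n   # integrate the energy over omega
  }
  total                              # summing over channels gives the roughness of this frame
}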
RoughnessFFT(inObjANI, inFrameWidth = 0.2, inFrameStepSize = 0.02, alpha = 1.6)
inObjANI: an object containing the auditory nerve image, as returned by CalcANI.

inFrameWidth: the width of the window for analysing the signal (in s). If empty or not specified, 0.2 s is used by default.

inFrameStepSize: the step size, or time interval between two successive frames (in s). If empty or not specified, 0.02 s is used by default.

alpha: the exponent applied to the synchronization index (see Details). If not specified, 1.6 is used by default.
For now, the roughness values are dependent on the frame width used. So, to make useful comparisons, only results obtained with the same frame width should be compared.
outFFTMatrix1: visualisation of the energy over the auditory channels.

outFFTMatrix2: visualisation of the energy spectrum for synchronization (synchronisation index SI).

outRoughness: roughness over the signal.

outSampleFreq: sampling rate of outRoughness (in Hz).

PlotRoughness: a plot of the roughness over the signal.
Marc Vidal (R version). Based on the original code from the IPEM Toolbox.
Javel, E., McGee, J., Horst, J., & Farley, G. (1988). Temporal mechanisms in auditory stimulus coding. In G. Edelman, W. Gall, & W. Cowan (Eds.), Auditory function: neurobiological bases of hearing. New York: John Wiley and Sons.
Leman, M. (2000). Visualization and calculation of the roughness of acoustical musical signals using the synchronization index model (SIM). In: Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy.
## Load the example sound, compute its auditory nerve image and its roughness
data(SchumannKurioseGeschichte)
s <- SchumannKurioseGeschichte
ANIs <- CalcANI(s, 22050)
Rs <- RoughnessFFT(ANIs)
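The components of the returned object can then be inspected, for example by plotting the roughness curve against time. This is a sketch that assumes the result Rs is a list accessed with $, and that outSampleFreq is the rate (in Hz) at which outRoughness is sampled:

r  <- Rs$outRoughness                        # roughness value per analysis frame
ts <- (seq_along(r) - 1) / Rs$outSampleFreq  # time axis in seconds
plot(ts, r, type = "l", xlab = "Time (s)", ylab = "Roughness")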