CalcANI: Auditory Nerve Image

View source: R/CalcANI.R

CalcANI    R Documentation

Auditory Nerve Image

Description

This function calculates the auditory nerve image (ANI) from a sampled signal s_i (i = 1, …, n) using an adapted version of Van Immerseel and Martens' (1992) model of the auditory periphery. The signal s_i is transformed into a multi-channel signal that emulates the spreading of excitation across equally spaced frequency bands. Different filtering techniques and an envelope extractor are used to obtain the neural firings; see Van Immerseel and Martens (1992). The output of the model is an auditory nerve image, or primary image: an n x m matrix, with m the number of channels, representing the instantaneous amplitude across all frequency subbands. The frequency range covered by the auditory filters depends on the centre frequencies of the filters, the number of channels and the distance between the channels. By default, 40 channels are used, half a critical band apart from each other. This setting covers a range from 140 Hz to 8877 Hz.

Note that CalcANI automatically resamples all input signals to 22050 Hz before passing them to the auditory model. The output of the auditory model has a sampling frequency of 11025 Hz, but for most calculations this can be downsampled to a lower value (f_A = 11025/4 Hz is the default).
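The sample-rate bookkeeping described above can be sketched as plain arithmetic (the variable names below are illustrative only, not part of the package API):

```r
# Rates implied by the description above.
model_freq <- 11025                          # output rate of the auditory model (Hz)
downsampling_factor <- 4                     # default inDownsamplingFactor
ani_freq <- model_freq / downsampling_factor # default ANI sample rate: 2756.25 Hz

# For a signal of `dur` seconds, the ANI matrix has roughly
# dur * ani_freq rows (time observations) and inNumOfChannels columns.
dur <- 2
n_obs <- floor(dur * ani_freq)               # approximate number of time observations
n_channels <- 40                             # default inNumOfChannels
```

Setting inDownsamplingFactor = 1 keeps the full 11025 Hz resolution at the cost of a matrix four times taller.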

Usage

CalcANI(inSignal, inSampleFreq,
        inDownsamplingFactor = 4, inNumOfChannels = 40,
        inFirstCBU = 2.0, inCBUStep = 0.5)

Arguments

inSignal

the sound signal to be processed. It can be a "Wave" object (tuneR), a numeric vector, or the name of a "*.wav" file stored in the working directory.

inSampleFreq

the sample frequency of the input signal (in Hz).

inDownsamplingFactor

the integer factor by which the outcome of the auditory model is downsampled (use 1 for no downsampling). If empty or not specified, 4 is used by default.

inNumOfChannels

number of channels to use. If empty or not specified, 40 is used by default.

inFirstCBU

frequency of first channel (in critical band units). If empty or not specified, 2 is used by default.

inCBUStep

frequency difference between channels (in cbu). If empty or not specified, 0.5 is used by default.

Value

An object of class "AI", which is a list with the following elements:

AuditoryNerveImage

a matrix of size n x m representing the auditory nerve image, where n is the number of time observations and m is the number of channels.

ANIFreq

sample frequency of ANI (in Hz).

ANIFilterFreqs

center frequencies used by the auditory model (in Hz).

Author(s)

Marc Vidal (R version). Based on the original code from IPEM Toolbox.

References

Van Immerseel, L. and Martens, J. (1992). Pitch and voiced/unvoiced determination with an auditory model. The Journal of the Acoustical Society of America, vol. 91, pp. 3511-3526.

Ligges, U., Krey, S., Mersmann, O., Schnackenberg, S. (2018). tuneR: Analysis of Music and Speech. R package version 1.3.3. URL: https://CRAN.R-project.org/package=tuneR

Examples

data(SchumannKurioseGeschichte)
s <- SchumannKurioseGeschichte
ANIs <- CalcANI(s, 22050)
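
The components of the returned "AI" object can then be inspected; this is a sketch assuming the example above has run successfully:

```r
# Inspect the "AI" object returned by CalcANI.
dim(ANIs$AuditoryNerveImage)   # n x m: time observations by channels
ANIs$ANIFreq                   # ANI sample frequency (11025/4 Hz by default)
ANIs$ANIFilterFreqs            # center frequencies of the 40 auditory filters (Hz)
```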

m-vidal/eaR documentation built on Nov. 18, 2022, 3:55 p.m.