# melfcc: MFCC Calculation

## Description

Calculate Mel-frequency cepstral coefficients.

## Usage

```
melfcc(samples, sr = samples@samp.rate, wintime = 0.025,
    hoptime = 0.01, numcep = 12, lifterexp = 0.6, htklifter = FALSE,
    sumpower = TRUE, preemph = 0.97, dither = FALSE,
    minfreq = 0, maxfreq = sr/2, nbands = 40, bwidth = 1,
    dcttype = c("t2", "t1", "t3", "t4"),
    fbtype = c("mel", "htkmel", "fcmel", "bark"), usecmp = FALSE,
    modelorder = NULL, spec_out = FALSE, frames_in_rows = TRUE)
```

## Arguments

| Argument | Description |
|---|---|
| `samples` | Object of Wave-class or WaveMC-class. Only the first channel will be used. |
| `sr` | Sampling rate of the signal. |
| `wintime` | Window length in seconds. |
| `hoptime` | Step between successive windows in seconds. |
| `numcep` | Number of cepstra to return. |
| `lifterexp` | Exponent for liftering; 0 = none. |
| `htklifter` | Use HTK sin lifter. |
| `sumpower` | If `sumpower = TRUE`, the frequency scale transformation is based on the power spectrum; if `sumpower = FALSE`, it is based on its square root (the absolute value of the spectrum) and squared afterwards. |
| `preemph` | Apply pre-emphasis filter [1 -preemph] (0 = none). |
| `dither` | Add offset to spectrum as if dither noise. |
| `minfreq` | Lowest band edge of mel filters (Hz). |
| `maxfreq` | Highest band edge of mel filters (Hz). |
| `nbands` | Number of warped spectral bands to use. |
| `bwidth` | Width of spectral bands in Bark/Mel. |
| `dcttype` | Type of DCT used: 1 or 2 (or 3 for HTK or 4 for feacalc). |
| `fbtype` | Auditory frequency scale to use: `"mel"`, `"bark"`, `"htkmel"`, `"fcmel"`. |
| `usecmp` | Apply equal-loudness weighting and cube-root compression (PLP instead of LPC). |
| `modelorder` | If `modelorder > 0`, fit a linear prediction (autoregressive) model of this order and calculate the cepstra from `lpcas`. |
| `spec_out` | Should matrices of the power spectrum and the auditory spectrum be returned? |
| `frames_in_rows` | Return time frames in rows; if `FALSE`, frames are returned in columns, as in the original Matlab code. |
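To make the relation between `wintime`, `hoptime` and `sr` concrete, the sketch below converts the two durations to sample counts and estimates how many frames a signal yields. It assumes the durations are rounded to whole samples and that only complete windows are counted; the package's internal framing may differ. `frame_layout` is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper (not part of the package): convert window/hop
# durations to sample counts. Assumes rounding to whole samples and
# counting only complete windows; the real framing may differ.
frame_layout <- function(n_samples, sr, wintime = 0.025, hoptime = 0.01) {
  winpts  <- round(wintime * sr)   # window length in samples
  steppts <- round(hoptime * sr)   # hop between windows in samples
  nframes <- if (n_samples < winpts) {
    0
  } else {
    1 + floor((n_samples - winpts) / steppts)
  }
  list(winpts = winpts, steppts = steppts, nframes = nframes)
}

# One second of audio at 16 kHz with the default 25 ms window and 10 ms hop
fl <- frame_layout(n_samples = 16000, sr = 16000)
str(fl)
```

With the defaults, a 16 kHz signal is cut into 400-sample windows that advance by 160 samples.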

## Details

Calculation of the MFCCs includes the following steps:

1. Preemphasis filtering

2. Take the absolute value of the STFT (usage of Hamming window)

3. Warp to auditory frequency scale (Mel/Bark)

4. Take the DCT of the log-auditory-spectrum

5. Return the first `numcep` components
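The five steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: it uses the HTK-style mel formula, plain triangular filters, a DCT-II with an arbitrary scaling, and no liftering or dither, so its numbers will not match `melfcc` output. All function names in the block are hypothetical.

```r
# Illustrative sketch only; NOT the package's implementation.
# hz2mel, mel2hz and mfcc_sketch are hypothetical helper names.
hz2mel <- function(f) 2595 * log10(1 + f / 700)   # HTK-style mel formula (assumption)
mel2hz <- function(m) 700 * (10^(m / 2595) - 1)

mfcc_sketch <- function(x, sr, wintime = 0.025, hoptime = 0.01,
                        numcep = 12, nbands = 40, preemph = 0.97) {
  # 1. Pre-emphasis: y[n] = x[n] - preemph * x[n - 1]
  x <- c(x[1], x[-1] - preemph * x[-length(x)])

  # 2. Magnitude of the STFT with a Hamming window
  winpts  <- round(wintime * sr)
  steppts <- round(hoptime * sr)
  nfft    <- 2^ceiling(log2(winpts))
  win     <- 0.54 - 0.46 * cos(2 * pi * (0:(winpts - 1)) / (winpts - 1))
  starts  <- seq(1, length(x) - winpts + 1, by = steppts)
  mag <- sapply(starts, function(s) {
    frame <- c(x[s:(s + winpts - 1)] * win, numeric(nfft - winpts))
    Mod(fft(frame))[1:(nfft / 2 + 1)]
  })                                    # (nfft/2 + 1) x nframes

  # 3. Warp to the mel scale with triangular filters
  fft_hz <- (0:(nfft / 2)) * sr / nfft
  edges  <- mel2hz(seq(hz2mel(0), hz2mel(sr / 2), length.out = nbands + 2))
  fbank <- t(sapply(seq_len(nbands), function(b) {
    rising  <- (fft_hz - edges[b]) / (edges[b + 1] - edges[b])
    falling <- (edges[b + 2] - fft_hz) / (edges[b + 2] - edges[b + 1])
    pmax(0, pmin(rising, falling))
  }))                                   # nbands x (nfft/2 + 1)
  aspec <- fbank %*% mag^2              # auditory power spectrum

  # 4. DCT-II of the log auditory spectrum (scaling chosen for illustration)
  k   <- 0:(numcep - 1)
  n   <- seq_len(nbands) - 0.5
  dct <- cos(outer(k, n) * pi / nbands) * sqrt(2 / nbands)
  cep <- dct %*% log(aspec + 1e-10)     # small floor avoids log(0)

  # 5. Only the first numcep DCT rows were computed; return frames in rows
  t(cep)
}

sr  <- 16000
x   <- sin(2 * pi * 440 * (0:(sr - 1)) / sr)   # one second of a 440 Hz tone
cep <- mfcc_sketch(x, sr)
dim(cep)   # frames x coefficients
```

Building the DCT matrix with only `numcep` rows means the truncation in step 5 happens by construction rather than by discarding computed values.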

## Value

| Component | Description |
|---|---|
| `cepstra` | Cepstral coefficients of the input signal (one time frame per row/column). |
| `aspectrum` | Auditory spectrum (spectrum after transformation to Mel/Bark scale) of the signal. |
| `pspectrum` | Power spectrum of the input signal. |
| `lpcas` | If `modelorder > 0`, the linear prediction coefficients (LPC/PLP). |

## Note

The following non-default values nearly duplicate Malcolm Slaney's `mfcc` from the Auditory toolbox for Matlab:

```
melfcc(d, 16000, wintime=0.016, lifterexp=0, minfreq=133.33,
  maxfreq=6855.6, sumpower=FALSE)
```

The result is approximately equal to `log(10) * 2 * mfcc(d, 16000)` in the Auditory toolbox.

The following non-default values nearly duplicate HTK's MFCC:

```
melfcc(d, 16000, lifterexp=22, htklifter=TRUE, nbands=20, maxfreq=8000,
  sumpower=FALSE, fbtype="htkmel", dcttype="t3")
```

The result is approximately equal to `2 * htkmelfcc(:,[13,[1:12]])`, where the HTK config has `PREEMCOEF = 0.97`, `NUMCHANS = 20`, `CEPLIFTER = 22`, `NUMCEPS = 12`, `WINDOWSIZE = 250000.0` (25 ms, since HTK expresses times in units of 100 ns), `USEHAMMING = T`, `TARGETKIND = MFCC_0`.
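The `lifterexp` and `htklifter` settings in the HTK call above control how the cepstral coefficients are reweighted. The sketch below shows the two weighting schemes as they are commonly defined: the HTK sine lifter `1 + (L/2) * sin(pi * n / L)` and an exponent lifter `n^lifterexp`, with the zeroth coefficient left unscaled. Whether the package computes its weights in exactly this way is an assumption, and `lifter_weights` is a hypothetical helper.

```r
# Hypothetical helper (not a package function): lifter weights for
# cepstral coefficients c0, c1, ..., c_(numcep - 1).
# Assumes the usual HTK sine lifter and a plain exponent lifter.
lifter_weights <- function(numcep, lifterexp, htklifter = FALSE) {
  n <- seq_len(numcep - 1)
  w <- if (htklifter) {
    1 + (lifterexp / 2) * sin(pi * n / lifterexp)   # HTK sine lifter, L = lifterexp
  } else {
    n^lifterexp                                     # exponent lifter; 0 gives all ones
  }
  c(1, w)   # c0 is left unscaled
}

w_htk <- lifter_weights(13, 22, htklifter = TRUE)   # matches CEPLIFTER = 22
w_exp <- lifter_weights(13, 0.6)                    # default lifterexp
w_off <- lifter_weights(13, 0)                      # no liftering
```

An exponent of 0 yields weights of 1 throughout, which is consistent with the documented "0 = none" behaviour of `lifterexp`.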

For more detail on reproducing other programs' outputs, see http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/mfccs.html

## Author(s)

Sebastian Krey

## References

Daniel P. W. Ellis: http://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/

## Examples

```
library(tuneR)  # provides melfcc, normalize, sine and square

testsound <- normalize(sine(400) + sine(1000) + square(250), "16")
m1 <- melfcc(testsound)

# Use PLP features to calculate cepstra and output the matrices like the
# original Matlab code (note: modelorder limits the number of cepstra)
m2 <- melfcc(testsound, numcep=9, usecmp=TRUE, modelorder=8,
  spec_out=TRUE, frames_in_rows=FALSE)
```

### Example output

```
Warning message:
In normalize(sine(400) + sine(1000) + square(250), "16") :
pcm set to TRUE since unit was one of 8, 16, or 24
There were 50 or more warnings (use warnings() to see the first 50)
```

