layer_mel_spectrogram: Layer Mel-Spectrogram


Description

Reweights a linear-frequency spectrogram onto the mel scale. The weight matrix is computed with tf$contrib$signal$linear_to_mel_weight_matrix.
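As a rough illustration of what such a weight matrix contains, the following base R sketch builds triangular mel bands using the HTK mel formula. It is not the kextra or TensorFlow implementation: the helper names hz_to_mel and mel_weight_matrix, and the choice of 51 spectrogram bins, are assumptions made for this example, and details such as special handling of the DC bin are left out.

```r
# Simplified sketch of a linear-to-mel weight matrix (illustrative only).
# hz_to_mel() and mel_weight_matrix() are hypothetical helpers, not kextra functions.
hz_to_mel <- function(hz) 1127 * log(1 + hz / 700)

mel_weight_matrix <- function(num_mel_bins, num_spectrogram_bins, sample_rate,
                              lower_edge_hertz, upper_edge_hertz) {
  # Frequency of each linear spectrogram bin, from 0 Hz up to the Nyquist frequency.
  bin_hz <- seq(0, sample_rate / 2, length.out = num_spectrogram_bins)
  bin_mel <- hz_to_mel(bin_hz)

  # num_mel_bins + 2 band edges, evenly spaced on the mel scale.
  edges <- seq(hz_to_mel(lower_edge_hertz), hz_to_mel(upper_edge_hertz),
               length.out = num_mel_bins + 2)

  weights <- matrix(0, nrow = num_spectrogram_bins, ncol = num_mel_bins)
  for (m in seq_len(num_mel_bins)) {
    lower <- edges[m]
    center <- edges[m + 1]
    upper <- edges[m + 2]
    # Triangular band: rises from lower to center, falls from center to upper.
    rising <- (bin_mel - lower) / (center - lower)
    falling <- (upper - bin_mel) / (upper - center)
    weights[, m] <- pmax(0, pmin(rising, falling))
  }
  weights
}

w <- mel_weight_matrix(num_mel_bins = 10, num_spectrogram_bins = 51,
                       sample_rate = 16000, lower_edge_hertz = 0,
                       upper_edge_hertz = 7400)
dim(w)  # rows: spectrogram bins, columns: mel bands
```

Multiplying a (frames x spectrogram bins) matrix by w would then give a (frames x mel bands) matrix, which is the reweighting this layer performs on each spectrogram.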

Usage

layer_mel_spectrogram(object, num_mel_bins = 128, sample_rate = 16000,
  lower_edge_hertz = 0, upper_edge_hertz = 7400, log_compress = TRUE,
  log_offset = 1e-06, name = NULL)

Arguments

num_mel_bins

Number of bands in the resulting mel spectrum.

sample_rate

Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.

log_compress

(TRUE/FALSE) It is common practice to apply a compressive nonlinearity such as a logarithm or power-law compression to spectrograms. This helps to balance the importance of detail in low and high energy regions of the spectrum, which more closely matches human auditory sensitivity.

log_offset

When compressing with a logarithm, it's a good idea to use a stabilizing offset to avoid high dynamic ranges caused by the singularity at zero.

name

An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.

lower_edge_hertz

Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.

upper_edge_hertz

The desired top edge of the highest frequency band.
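The role of sample_rate and log_offset can be illustrated with a few lines of base R. This is a sketch under stated assumptions: the number of spectrogram bins (51) and the energy values are made up, and it assumes the offset is added to the energies before the logarithm is taken, which this page does not spell out.

```r
sample_rate <- 16000
fft_unique_bins <- 51  # hypothetical number of spectrogram bins per frame

# The unique FFT bins span 0 Hz to the Nyquist frequency (sample_rate / 2),
# so sample_rate determines the frequency assigned to each bin.
bin_hz <- seq(0, sample_rate / 2, length.out = fft_unique_bins)
range(bin_hz)

# Made-up mel energies, including an exact zero.
mel_energy <- c(0, 1e-4, 1, 100)
log_offset <- 1e-6

# Without an offset, the zero entry maps to -Inf.
raw_log <- log(mel_energy)

# With the offset, every entry stays finite (assumed form: log(x + log_offset)).
compressed <- log(mel_energy + log_offset)
compressed
```

With these values the default upper_edge_hertz of 7400 lies below the 8000 Hz Nyquist frequency, so the highest mel band stays within the range covered by the spectrogram bins.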

Details

It only works with the TensorFlow backend.

Input shape

4D tensor (a spectrogram) with shape: (samples, channels, frames, fft_unique_bins) if data_format='channels_first' or 4D tensor with shape: (samples, frames, fft_unique_bins, channels) if data_format='channels_last'.

Output shape

4D tensor with shape: (samples, frames, num_mel_bins, channels) if data_format='channels_last' or 4D tensor with shape: (samples, channels, frames, num_mel_bins) if data_format='channels_first'.
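The shape change for data_format='channels_last' can be traced with plain R arrays. This is only a sketch: the dimension sizes are arbitrary, and the uniform weight matrix is a placeholder standing in for a real mel weight matrix.

```r
samples <- 2
frames <- 5
fft_unique_bins <- 51
channels <- 1
num_mel_bins <- 10

# Placeholder spectrogram batch in channels_last layout.
spec <- array(1, dim = c(samples, frames, fft_unique_bins, channels))

# Placeholder weights (uniform), not a real mel weight matrix.
w <- matrix(1 / fft_unique_bins, nrow = fft_unique_bins, ncol = num_mel_bins)

mel <- array(0, dim = c(samples, frames, num_mel_bins, channels))
for (s in seq_len(samples)) {
  for (ch in seq_len(channels)) {
    # (frames x fft_unique_bins) %*% (fft_unique_bins x num_mel_bins)
    mel[s, , , ch] <- spec[s, , , ch] %*% w
  }
}
dim(mel)  # samples, frames, num_mel_bins, channels
```

Only the frequency axis changes size, from fft_unique_bins to num_mel_bins; the samples, frames, and channels axes are carried through unchanged.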

See Also

Other audio: layer_spectrogram

Examples

## Not run: 
library(keras)
library(kextra)
input <- layer_input(shape = c(16000, 1))
output <- layer_spectrogram(input, 100, 10) %>% layer_mel_spectrogram(10)

## End(Not run)

dfalbel/kextra documentation built on May 13, 2019, 3 a.m.