layer_spectrogram: Layer Spectrogram


Description

Computes the spectrogram of a signal using the STFT implemented in tf.contrib.signal.

Usage

layer_spectrogram(object, frame_length, frame_step, fft_length = NULL,
  pad_end = FALSE, mode = "power", log_compress = FALSE,
  log_offset = 1e-06, name = NULL)

Arguments

object

Model or layer object

frame_length

The window length in samples.

frame_step

The number of samples to step between the starts of successive frames.

fft_length

The size of the FFT to apply. If not provided, uses the smallest power of 2 enclosing frame_length.

pad_end

Whether to pad the end of the signal with zeros when the provided frame length and step produce a frame that lies partially past the end of the signal.

mode

The mode of the spectrogram. Options are 'complex', 'power' or 'magnitude'. A 'power' spectrogram is the squared magnitude of the complex-valued STFT. A 'magnitude' spectrogram is the magnitude of the complex-valued STFT. 'complex' returns the output of stft unchanged.

log_compress

(TRUE/FALSE) Whether to apply logarithmic compression to the spectrogram. It is common practice to apply a compressive nonlinearity, such as a logarithm or power-law compression, to spectrograms. This helps to balance the importance of detail in low- and high-energy regions of the spectrum, which more closely matches human auditory sensitivity.

log_offset

The stabilizing offset used when log_compress is TRUE. When compressing with a logarithm, an offset avoids the very high dynamic range caused by the singularity of the logarithm at zero.

name

An optional name string for the layer. Should be unique in a model (do not reuse the same name twice). It will be autogenerated if it isn't provided.
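
The effect of mode, log_compress and log_offset can be illustrated on a single frame in base R. This is only a sketch under stated assumptions: it uses stats::fft instead of the TensorFlow STFT, applies no window function, and assumes the compression has the form log(x + log_offset), which this page does not spell out. The helper name frame_spectrum is hypothetical and is not part of kextra.

```r
# Hypothetical helper: spectrum of one real-valued frame, base R only.
frame_spectrum <- function(frame, mode = "power",
                           log_compress = FALSE, log_offset = 1e-06) {
  n <- length(frame)
  # Keep only the non-redundant bins of a real signal's FFT.
  z <- stats::fft(frame)[seq_len(n %/% 2 + 1)]
  out <- switch(mode,
    complex   = z,          # raw complex-valued coefficients
    magnitude = Mod(z),     # |z|
    power     = Mod(z)^2,   # |z|^2
    stop("unknown mode: ", mode)
  )
  # Assumed form of the compression: log of the offset spectrum.
  if (log_compress) out <- log(out + log_offset)
  out
}

frame <- c(1, 0, 0, 0)  # unit impulse: every bin has magnitude 1
mag <- frame_spectrum(frame, mode = "magnitude")
pow <- frame_spectrum(frame, mode = "power")
# An all-zero frame has zero power; the offset keeps the log finite.
silent <- frame_spectrum(rep(0, 4), log_compress = TRUE)
```

Without the offset, the silent frame would map to -Inf, which is the singularity that log_offset is meant to avoid.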

Details

This layer only works with the TensorFlow backend.

Input shape

3D tensor with shape: (samples, channels, audio_samples) if data_format='channels_first' or 3D tensor with shape: (samples, audio_samples, channels) if data_format='channels_last'.

Output shape

4D tensor with shape: (samples, frames, fft_unique_bins, channels) if data_format='channels_last' or 4D tensor with shape: (samples, channels, frames, fft_unique_bins) if data_format='channels_first'.
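
The frames and fft_unique_bins dimensions can be estimated from the arguments. The sketch below assumes the layer follows the usual conventions of TensorFlow's framing and STFT ops: only complete frames are kept unless pad_end is TRUE, and a real-valued FFT of size fft_length yields fft_length %/% 2 + 1 unique bins. The helper name spectrogram_dims is hypothetical and is not part of kextra.

```r
# Hypothetical helper: expected (frames, fft_unique_bins) for one signal.
spectrogram_dims <- function(audio_samples, frame_length, frame_step,
                             fft_length = NULL, pad_end = FALSE) {
  # Default FFT size: smallest power of 2 enclosing frame_length.
  if (is.null(fft_length)) fft_length <- 2^ceiling(log2(frame_length))
  frames <- if (pad_end) {
    # Partial trailing frames are zero-padded and kept (assumed).
    ceiling(audio_samples / frame_step)
  } else {
    # Only frames that fit entirely inside the signal are kept (assumed).
    max(0, 1 + (audio_samples - frame_length) %/% frame_step)
  }
  c(frames = frames, fft_unique_bins = fft_length %/% 2 + 1)
}

dims <- spectrogram_dims(16000, 100, 10)
dims_padded <- spectrogram_dims(16000, 100, 10, pad_end = TRUE)
```

Under these assumptions, an input of shape (samples, 16000, 1) with frame_length = 100 and frame_step = 10 would give 1591 frames and 65 bins, i.e. an output of shape (samples, 1591, 65, 1) for channels_last.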

See Also

Other audio: layer_mel_spectrogram

Examples

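## Not run: 
library(keras)
library(kextra)
# Input with 16000 audio samples and 1 channel (channels_last layout)
input <- layer_input(shape = c(16000, 1))
# Spectrogram with frame_length = 100 and frame_step = 10
output <- layer_spectrogram(input, frame_length = 100, frame_step = 10)

## End(Not run)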

dfalbel/kextra documentation built on May 13, 2019, 3 a.m.