model_melresnet: MelResNet

model_melresnet    R Documentation

MelResNet

Description

The MelResNet layer applies a stack of ResBlocks to a spectrogram. Calling the layer passes the input spectrogram through that stack.

Usage

model_melresnet(
  n_res_block = 10,
  n_freq = 128,
  n_hidden = 128,
  n_output = 128,
  kernel_size = 5
)

Arguments

n_res_block

The number of ResBlocks in the stack. (Default: 10)

n_freq

The number of frequency bins in the spectrogram. (Default: 128)

n_hidden

The number of hidden dimensions in each ResBlock. (Default: 128)

n_output

The number of output dimensions of the MelResNet. (Default: 128)

kernel_size

The kernel size of the first Conv1d layer. (Default: 5)

Details

Forward parameter: specgram (Tensor), the input sequence to the MelResNet layer, with shape (n_batch, n_freq, n_time).

Value

A Tensor of shape (n_batch, n_output, n_time - kernel_size + 1).
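
The output shape formula can be checked with plain arithmetic before building a model. The sketch below is a minimal illustration in base R. The helper melresnet_output_shape is hypothetical and not part of torchaudio; it only evaluates the formula stated above and assumes the defaults n_output = 128 and kernel_size = 5.

```r
# Hypothetical helper (not part of torchaudio): evaluates the documented
# output shape (n_batch, n_output, n_time - kernel_size + 1).
melresnet_output_shape <- function(n_batch, n_time,
                                   n_output = 128, kernel_size = 5) {
  # The formula only makes sense when the kernel fits inside the time axis.
  stopifnot(n_time >= kernel_size)
  c(n_batch, n_output, n_time - kernel_size + 1)
}

# A batch of 10 spectrograms with 512 time steps and default settings.
melresnet_output_shape(n_batch = 10, n_time = 512)  # 10 128 508
```

With the defaults, the time axis shrinks by kernel_size - 1 = 4 steps, so an input with 512 time steps yields 508.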

Examples


if (torch::torch_is_installed()) {
  melresnet <- model_melresnet()
  input <- torch::torch_rand(10, 128, 512)  # a random spectrogram: (n_batch, n_freq, n_time)
  output <- melresnet(input)  # shape: (10, 128, 508)
}


torchaudio documentation built on Feb. 16, 2023, 9:41 p.m.