model_wavernn | R Documentation |
WaveRNN model based on the implementation from fatchord. The original implementation was introduced in "Efficient Neural Audio Synthesis". Calling the model passes the input through the WaveRNN layers.
model_wavernn(
  upsample_scales,
  n_classes,
  hop_length,
  n_res_block = 10,
  n_rnn = 512,
  n_fc = 512,
  kernel_size = 5,
  n_freq = 128,
  n_hidden = 128,
  n_output = 128
)
upsample_scales |
the list of upsample scales. |
n_classes |
the number of output classes. |
hop_length |
the number of samples between the starts of consecutive frames. |
n_res_block |
the number of ResBlocks in the stack. (Default: 10) |
n_rnn |
the dimension of the RNN layer. (Default: 512) |
n_fc |
the dimension of the fully connected layer. (Default: 512) |
kernel_size |
the kernel size of the first Conv1d layer. (Default: 5) |
n_freq |
the number of bins in a spectrogram. (Default: 128) |
n_hidden |
the number of hidden dimensions of ResBlock. (Default: 128) |
n_output |
the number of output dimensions of MelResNet. (Default: 128) |
Forward parameters:
waveform: the input waveform to the WaveRNN layer, of shape (n_batch, 1, (n_time - kernel_size + 1) * hop_length)
specgram: the input spectrogram to the WaveRNN layer, of shape (n_batch, 1, n_freq, n_time)
The input channels of waveform and spectrogram have to be 1. The product of upsample_scales must equal hop_length.
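As a quick sanity check (a sketch using only base R; the scale values are taken from the example below), the constraint between upsample_scales and hop_length can be verified before constructing the model:

```r
upsample_scales <- c(2, 2, 3)
hop_length <- 12

# model_wavernn() requires prod(upsample_scales) == hop_length
stopifnot(prod(upsample_scales) == hop_length)
```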
Tensor shape: (n_batch, 1, (n_time - kernel_size + 1) * hop_length, n_classes)
if (torch::torch_is_installed()) {
  wavernn <- model_wavernn(upsample_scales = c(2, 2, 3), n_classes = 5, hop_length = 12)
  # waveform shape: (n_batch, n_channel, (n_time - kernel_size + 1) * hop_length)
  waveform <- torch::torch_rand(3, 1, (10 - 5 + 1) * 12)
  spectrogram <- torch::torch_rand(3, 1, 128, 10)
  output <- wavernn(waveform, spectrogram)
}