knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
First, we need to install the fastaudio module.
reticulate::py_install('fastaudio',pip = TRUE)
Grab data from TensorFlow Speech Commands (2.3 GB):
commands_path = "SPEECHCOMMANDS"
audio_files = get_audio_files(commands_path)
length(audio_files$items)
# [1] 105835
Prepare the dataset and load it into a data loader:
DBMelSpec = SpectrogramTransformer(mel=TRUE, to_db=TRUE)
a2s = DBMelSpec()
crop_4000ms = ResizeSignal(4000)
tfms = list(crop_4000ms, a2s)
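To make the `ResizeSignal(4000)` step concrete, here is a minimal base-R sketch of forcing a clip to 4000 ms. It assumes a 16 kHz sample rate and zero-padding at the end of short clips. `resize_signal_sketch` is a hypothetical helper for illustration, not the fastaudio implementation, whose padding placement may differ.

```r
# Hypothetical helper: force a mono signal to a fixed duration in milliseconds.
# Assumptions: `signal` is a numeric vector of samples, `sample_rate` is in Hz,
# and clips that are too short are zero-padded at the end.
resize_signal_sketch <- function(signal, duration_ms, sample_rate = 16000) {
  target_len <- as.integer(round(duration_ms / 1000 * sample_rate))
  n <- length(signal)
  if (n >= target_len) {
    # Long enough: keep only the first `target_len` samples.
    signal[seq_len(target_len)]
  } else {
    # Too short: append zeros up to the target length.
    c(signal, numeric(target_len - n))
  }
}

# A synthetic one-second tone at the assumed 16 kHz rate.
one_second_clip <- sin(2 * pi * 440 * seq(0, 1, length.out = 16000))
resized <- resize_signal_sketch(one_second_clip, 4000)
length(resized)  # 64000 samples = 4 s * 16000 Hz
```

Fixing every clip to the same length is what lets the spectrograms be stacked into equally sized batches.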
auds = DataBlock(blocks = list(AudioBlock(), CategoryBlock()),
                 get_items = get_audio_files,
                 splitter = RandomSplitter(),
                 item_tfms = tfms,
                 get_y = parent_label)

audio_dbunch = auds %>% dataloaders(commands_path, item_tfms = tfms, bs = 20)
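`get_y = parent_label` takes each file's label from the name of its parent directory. The same idea can be sketched in base R. The file paths below are invented for illustration, and `parent_label_sketch` is a hypothetical stand-in, not the fastai function.

```r
# Hypothetical stand-in for parent_label: the label is the name of the
# directory that directly contains the file.
parent_label_sketch <- function(paths) basename(dirname(paths))

# Invented example paths (not real files from the dataset).
example_paths <- c(
  "SPEECHCOMMANDS/yes/clip_a.wav",
  "SPEECHCOMMANDS/no/clip_b.wav",
  "SPEECHCOMMANDS/yes/clip_c.wav"
)

labels <- parent_label_sketch(example_paths)
table(labels)  # count of clips per label
```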
Inspect a batch:
audio_dbunch %>% show_batch(figsize = c(15, 8.5), nrows = 3, ncols = 3, max_n = 9, dpi = 180)
Before fitting, change the model's first convolution from 3 input channels to 1, since the spectrograms have a single channel:
torch = torch()
nn = nn()

learn = Learner(audio_dbunch, xresnet18(pretrained = FALSE),
                nn$CrossEntropyLoss(), metrics = accuracy)

# change the first convolution from 3 input channels to 1
learn$model[0][0][['in_channels']] %f% 1L

# keep one input-channel slice of the weights and restore the channel axis
new_weight_shape <- torch$nn$parameter$Parameter(
  (learn$model[0][0]$weight %>% narrow('[:,1,:,:]'))$unsqueeze(1L))

# assign with %f%
learn$model[0][0][['weight']] %f% new_weight_shape
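The weight surgery above keeps one input-channel slice of the first convolution and then restores the channel axis with size 1. The shape arithmetic can be sketched with a plain R array. The 32 x 3 x 3 x 3 shape and the random values are assumptions for illustration, not values read from the model.

```r
set.seed(1)

# Assumed layout: (out_channels, in_channels, kernel_height, kernel_width).
w <- array(rnorm(32 * 3 * 3 * 3), dim = c(32, 3, 3, 3))

# Python's `[:, 1, :, :]` selects the second input channel. R is 1-based,
# so the equivalent index is 2. `drop = FALSE` keeps the channel axis,
# which plays the role of `unsqueeze(1)`.
w_single <- w[, 2, , , drop = FALSE]

dim(w_single)  # 32 1 3 3
```

Without `drop = FALSE`, R would return a 32 x 3 x 3 array, which is the same situation that makes `unsqueeze(1L)` necessary on the Python side.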
Weights and biases can be saved and visualized on wandb.ai:
# login for the 1st time then remove it
login("API_key_from_wandb_dot_ai")
init(project='R')
wandb: Currently logged in as: henry090 (use `wandb login --relogin` to force relogin)
wandb: Tracking run with wandb version 0.10.8
wandb: Syncing run macabre-zombie-2
wandb: ⭐️ View project at https://wandb.ai/henry090/speech_recognition_from_R
wandb: 🚀 View run at https://wandb.ai/henry090/speech_recognition_from_R/runs/2sjw3juv
wandb: Run data is saved locally in wandb/run-20201030_224503-2sjw3juv
wandb: Run `wandb off` to turn off syncing.
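Because the API key should not stay in the script, one option is to read it from an environment variable instead of typing it inline. The sketch below uses base R only. `get_wandb_key` is a hypothetical helper, and storing the key under `WANDB_API_KEY` is an assumption about your setup.

```r
# Hypothetical helper: fetch the API key from an environment variable
# and fail early with a clear message if it is missing.
get_wandb_key <- function(var = "WANDB_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set.", call. = FALSE)
  }
  key
}

# Usage with the login() call shown above (left commented out so this
# snippet runs without fastai or a network connection):
# login(get_wandb_key())
```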
Now we can train our model:
learn %>% fit_one_cycle(3, lr_max=slice(1e-2), cbs = list(WandbCallback()))
epoch   train_loss   valid_loss   accuracy   time
------  -----------  -----------  ---------  -----
WandbCallback requires use of "SaveModelCallback" to log best model
0       0.590236     0.728817     0.787121   04:18
WandbCallback was not able to get prediction samples -> wandb.log must be passed a dictionary
1       0.288492     0.310335     0.908490   04:19
2       0.182899     0.196792     0.941088   04:10
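`fit_one_cycle` follows a one-cycle learning-rate policy: the rate warms up to `lr_max` and is then annealed down for the rest of training. The base-R sketch below shows the shape of such a schedule with cosine segments. `one_cycle_lr` is a hypothetical function, and the constants (warm-up fraction 0.25, start divisor 25, final divisor 1e5) are assumptions chosen for illustration that may not match the library's defaults.

```r
# Hypothetical one-cycle schedule. `pos` is training progress in [0, 1].
# All constants are illustrative assumptions, not values read from fastai.
one_cycle_lr <- function(pos, lr_max = 1e-2, pct_start = 0.25,
                         div = 25, div_final = 1e5) {
  # Cosine interpolation from `start` (p = 0) to `end` (p = 1).
  cos_anneal <- function(start, end, p) {
    start + (1 + cos(pi * (1 - p))) * (end - start) / 2
  }
  ifelse(
    pos < pct_start,
    # Warm-up: lr_max / div up to lr_max.
    cos_anneal(lr_max / div, lr_max, pos / pct_start),
    # Annealing: lr_max down to lr_max / div_final.
    cos_anneal(lr_max, lr_max / div_final,
               (pos - pct_start) / (1 - pct_start))
  )
}

# Learning rate at the start, at the peak, and at the end of training.
schedule <- one_cycle_lr(c(0, 0.25, 1))
schedule
```

Plotting `one_cycle_lr(seq(0, 1, by = 0.01))` shows the short ramp up followed by the long decay.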
See the dashboard for this run here:
https://wandb.ai/henry090/speech_recognition_from_R/runs/2sjw3juv?workspace=user-henry090