View source: R/audio_cognizers.R
Convert your audio to transcripts with optional keyword detection and profanity cleaning.
audio_text(audios, userpwd, keep_data = "true", callback = NULL,
  model = "en-US_BroadbandModel", continuous = FALSE,
  inactivity_timeout = 30, keywords = list(), keywords_threshold = NA,
  max_alternatives = 1, word_alternatives_threshold = NA,
  word_confidence = FALSE, timestamps = FALSE, profanity_filter = TRUE,
  smart_formatting = FALSE, content_type = "audio/wav")
audios: Character vector (list) of paths to audio files.
userpwd: Character scalar containing username:password for the service.
keep_data: Character scalar specifying whether to share your data with Watson services for the purpose of training their models.
callback: Function that can be applied to responses to examine HTTP status, headers, and content, to debug, or to write a custom parser for content. The default callback parses content into a data.frame and drops the other response values, so the output can be passed directly to tidyverse packages like dplyr or ggplot2. For further details or debugging, pass print or a more elaborate function.
model: Character scalar specifying language and bandwidth model. Alternatives are ar-AR_BroadbandModel, en-UK_BroadbandModel, en-UK_NarrowbandModel, en-US_NarrowbandModel, es-ES_BroadbandModel, es-ES_NarrowbandModel, fr-FR_BroadbandModel, ja-JP_BroadbandModel, ja-JP_NarrowbandModel, pt-BR_BroadbandModel, pt-BR_NarrowbandModel, zh-CN_BroadbandModel, zh-CN_NarrowbandModel.
continuous: Logical scalar specifying whether to return after the first end-of-speech incident (long pause) or to wait and combine results.
inactivity_timeout: Integer scalar giving the number of seconds after which the result is returned if no speech is detected.
keywords: List of keywords to be detected in the speech stream.
keywords_threshold: Double scalar from 0 to 1 specifying the lower bound on confidence to accept detected keywords in speech.
max_alternatives: Integer scalar giving the maximum number of alternative transcripts to return.
word_alternatives_threshold: Double scalar from 0 to 1 giving the lower bound on confidence of possible words.
word_confidence: Logical scalar indicating whether to return confidence for each word.
timestamps: Logical scalar indicating whether to return time alignment for each word.
profanity_filter: Logical scalar indicating whether to censor profane words.
smart_formatting: Logical scalar indicating whether dates, times, numbers, etc. are to be formatted nicely in the transcript.
content_type: Character scalar giving the format of the audio file. Alternatives are audio/flac, audio/l16;rate=n;channels=k (16 channel limit), audio/wav (9 channel limit), audio/ogg;codecs=opus, audio/basic (narrowband models only).
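Some argument values are easy to get wrong before a request is ever sent, in particular the parameterized audio/l16 content type and the 0 to 1 thresholds. The sketch below shows one way to build and check them locally. Both helpers, l16_content_type and check_threshold, are hypothetical and not part of the package; they only encode the limits stated in the argument descriptions.

```r
# Hypothetical helper (not part of the package): build the content_type
# string for raw 16-bit PCM audio, "audio/l16;rate=n;channels=k".
l16_content_type <- function(rate, channels = 1L) {
  stopifnot(is.numeric(rate), length(rate) == 1, rate > 0)
  stopifnot(is.numeric(channels), length(channels) == 1,
            channels >= 1, channels <= 16)  # documented 16 channel limit
  sprintf("audio/l16;rate=%d;channels=%d",
          as.integer(rate), as.integer(channels))
}

# Hypothetical helper: validate a threshold argument.
# NA is the documented default and is treated here as "not set".
check_threshold <- function(x) {
  stopifnot(length(x) == 1)
  if (!is.na(x)) stopifnot(is.numeric(x), x >= 0, x <= 1)
  invisible(x)
}

l16_content_type(16000, 1)  # "audio/l16;rate=16000;channels=1"
check_threshold(0.5)        # passes silently
check_threshold(NA)         # passes silently, threshold left unset
```

The resulting string could then be passed as content_type, and a checked value as keywords_threshold or word_alternatives_threshold.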
Returns a list of parsed responses.
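A minimal sketch of a call with keyword detection and timestamps follows. It assumes the function is exported by the cognizer package, that credentials are stored in an environment variable named WATSON_SPEECH_USERPWD, and that the two .wav paths exist; the variable name and the file names are illustrative assumptions, not part of the package. The request is skipped when any of these is missing.

```r
# Assumption: credentials are stored as "username:password" in this
# (hypothetical) environment variable.
userpwd <- Sys.getenv("WATSON_SPEECH_USERPWD")

# Hypothetical audio paths; replace with real files.
audios <- c("interview_part1.wav", "interview_part2.wav")

# Only contact the service when the package, credentials, and files are present.
can_run <- requireNamespace("cognizer", quietly = TRUE) &&
  nzchar(userpwd) && all(file.exists(audios))

if (can_run) {
  transcripts <- cognizer::audio_text(
    audios, userpwd,
    keywords = list("budget", "deadline"),
    keywords_threshold = 0.5,
    timestamps = TRUE
  )
  # The documented return value is a list of parsed responses;
  # str() shows the actual shape rather than assuming it.
  str(transcripts, max.level = 2)
} else {
  transcripts <- NULL
  message("Skipping the request: package, credentials, or audio files missing.")
}
```

Passing callback = print instead of the default may help when a response needs to be inspected in full, as the callback argument description suggests.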