audio_text: IBM Watson Audio Transcriber
In ColumbusCollaboratory/cognizer: Access to Cognitive APIs


View source: R/audio_cognizers.R

Convert your audio to transcripts with optional keyword detection and profanity cleaning.

audio_text(audios, userpwd, keep_data = "true", callback = NULL,
  model = "en-US_BroadbandModel", continuous = FALSE,
  inactivity_timeout = 30, keywords = list(), keywords_threshold = NA,
  max_alternatives = 1, word_alternatives_threshold = NA,
  word_confidence = FALSE, timestamps = FALSE, profanity_filter = TRUE,
  smart_formatting = FALSE, content_type = "audio/wav")
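A minimal sketch of a call, following the usage signature above. The file path, the environment variable name `WATSON_SPEECH_USERPWD`, and the keyword values are hypothetical placeholders, not part of the package. The call is guarded so that it only runs when the package, the credentials, and the audio file are all available.

```r
# Hypothetical placeholders: replace with your own file path and credentials.
audio_path <- "path/to/recording.wav"
# Assumed to hold "username:password"; the variable name is illustrative only.
userpwd <- Sys.getenv("WATSON_SPEECH_USERPWD")

# Argument names mirror the usage signature; values are illustrative.
args <- list(
  audios = audio_path,
  userpwd = userpwd,
  model = "en-US_BroadbandModel",
  keywords = list("invoice", "refund"),
  keywords_threshold = 0.5,
  timestamps = TRUE,
  content_type = "audio/wav"
)

# Only contact the service when everything needed is in place.
can_run <- requireNamespace("cognizer", quietly = TRUE) &&
  nzchar(userpwd) &&
  file.exists(audio_path)

if (can_run) {
  transcripts <- do.call(cognizer::audio_text, args)
  # Inspect the top levels of the result without assuming its exact shape.
  str(transcripts, max.level = 2)
}
```

Reading the credentials from an environment variable keeps the username:password pair out of scripts that may be shared or committed.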

`audios`	Character vector (list) of paths to the audio files to be transcribed.
`userpwd`	Character scalar containing username:password for the service.
`keep_data`	Character scalar specifying whether to share your data with Watson services for the purpose of training their models.
`callback`	Function that can be applied to responses to examine http status, headers, and content, to debug or to write a custom parser for content. The default callback parses content into a data.frame while dropping other response values to make the output easily passable to tidyverse packages like dplyr or ggplot2. For further details or debugging, one can pass print or a more complicated function.
`model`	Character scalar specifying language and bandwidth model. Alternatives are ar-AR_BroadbandModel, en-UK_BroadbandModel, en-UK_NarrowbandModel, en-US_NarrowbandModel, es-ES_BroadbandModel, es-ES_NarrowbandModel, fr-FR_BroadbandModel, ja-JP_BroadbandModel, ja-JP_NarrowbandModel, pt-BR_BroadbandModel, pt-BR_NarrowbandModel, zh-CN_BroadbandModel, zh-CN_NarrowbandModel.
`continuous`	Logical scalar specifying whether to return after a first end-of-speech incident (long pause) or to wait to combine results.
`inactivity_timeout`	Integer scalar giving the number of seconds after which the result is returned if no speech is detected.
`keywords`	List of keywords to be detected in the speech stream.
`keywords_threshold`	Double scalar from 0 to 1 specifying the lower bound on confidence to accept detected keywords in speech.
`max_alternatives`	Integer scalar giving the maximum number of alternative transcripts to return.
`word_alternatives_threshold`	Double scalar from 0 to 1 giving lower bound on confidence of possible words.
`word_confidence`	Logical scalar indicating whether to return confidence for each word.
`timestamps`	Logical scalar indicating whether to return time alignment for each word.
`profanity_filter`	Logical scalar indicating whether to censor profane words.
`smart_formatting`	Logical scalar indicating whether dates, times, numbers, etc. are to be formatted nicely in the transcript.
`content_type`	Character scalar specifying the format of the audio file. Alternatives are audio/flac, audio/l16;rate=n;channels=k (16 channel limit), audio/wav (9 channel limit), audio/ogg;codecs=opus, audio/basic (narrowband models only).
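The constraints in the argument list above (thresholds between 0 and 1, a fixed set of content types) can be checked locally before any request is sent. The sketch below is not part of cognizer: `guess_content_type` and `check_threshold` are hypothetical helpers, and the extension-to-type mapping is an assumption covering only three of the listed formats.

```r
# Hypothetical helper: pick a content_type from the file extension.
# The mapping is an assumption; audio/l16 and audio/basic are left out
# because they need extra information (sample rate, channels, model type).
guess_content_type <- function(path) {
  ext <- tolower(tools::file_ext(path))
  types <- c(
    wav  = "audio/wav",
    flac = "audio/flac",
    ogg  = "audio/ogg;codecs=opus"
  )
  if (!ext %in% names(types)) {
    stop("No content_type mapping for extension: '", ext, "'")
  }
  unname(types[ext])
}

# Hypothetical helper: a threshold must be NA (unset) or one number in [0, 1].
check_threshold <- function(x, name) {
  ok <- length(x) == 1 &&
    (is.na(x) || (is.numeric(x) && x >= 0 && x <= 1))
  if (!ok) {
    stop(name, " must be NA or a number between 0 and 1")
  }
  invisible(x)
}

# Example: validate the inputs before building a call to audio_text().
content_type <- guess_content_type("interview.flac")
check_threshold(0.5, "keywords_threshold")
check_threshold(NA, "word_alternatives_threshold")
```

Failing early on a bad threshold or an unsupported extension avoids a round trip to the service just to receive an error response.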