View source: R/reindeeR_annotate.R
annotate_voiceactivity | R Documentation |
Voice activity detection is applied to a database to find portions of the speech signals where spoken communication is likely to have occurred. This makes transcription work more efficient in databases with silences or many small utterances that should be disregarded. The segmentation is intended for indexing and easy navigation of the database only, and should not be inserted into a hierarchy of levels. Instead, the intended use of this function is to supply the result of a "VAD == SPEECH" query call to a serve or reindeer:write_bundleList call, so that the annotations can be used for efficient navigation of the database. If the resulting labels are not helpful for the recording settings used, the user can rerun this function with more applicable thresholds, overwriting the previously generated labels.
annotate_voiceactivity(
emuDBhandle,
auth_key,
levelname = "VAD",
speech_probability_threshold = 0.6,
nospeech_probability_threshold = 0.4,
minimum_speech_duration = 0.2,
minimum_nonspeech_duration = 0.1
)
emuDBhandle |
An emuR database handle. |
auth_key |
A Hugging Face 'User Access Token' for a user which has activated access to the pyannote/segmentation model. |
levelname |
The name of the segmentation level (and attribute) to create to hold the annotations of speech. |
speech_probability_threshold |
The probability threshold above which the model will perceive the signal to contain speech. |
nospeech_probability_threshold |
The probability threshold below which the model will perceive the signal to contain non-speech. |
minimum_speech_duration |
The minimum duration of a section of speech to consider (in seconds). |
minimum_nonspeech_duration |
The minimum duration of a portion that could be non-speech (in seconds). |
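To illustrate how two probability thresholds and two minimum durations can interact, here is a minimal base R sketch. It assumes a hysteresis scheme: the state switches to speech only above the upper threshold and back to non-speech only below the lower one, after which short pauses are bridged and short speech stretches are dropped. This is an assumption made for illustration and is not taken from the source of annotate_voiceactivity or of pyannote-audio. The helper vad_segments, its frame_step argument and the synthetic probabilities are all hypothetical.

```r
# Hypothetical helper for illustration only; not part of reindeer or pyannote-audio.
# probs:      frame-wise speech probabilities (synthetic in the example below)
# frame_step: assumed duration of one frame in seconds
vad_segments <- function(probs,
                         frame_step = 0.01,
                         speech_probability_threshold = 0.6,
                         nospeech_probability_threshold = 0.4,
                         minimum_speech_duration = 0.2,
                         minimum_nonspeech_duration = 0.1) {
  # 1. Hysteresis: switch on above the upper threshold, off below the lower one.
  state <- FALSE
  active <- logical(length(probs))
  for (i in seq_along(probs)) {
    if (!state && probs[i] > speech_probability_threshold) {
      state <- TRUE
    } else if (state && probs[i] < nospeech_probability_threshold) {
      state <- FALSE
    }
    active[i] <- state
  }

  # 2. Convert runs of active frames to frame offsets (from is 0-based, to is exclusive).
  runs <- rle(active)
  last <- cumsum(runs$lengths)
  first <- last - runs$lengths
  from <- first[runs$values]
  to <- last[runs$values]

  # 3. Bridge pauses shorter than the minimum non-speech duration.
  if (length(from) > 1) {
    gaps <- from[-1] - to[-length(to)]
    keep_gap <- gaps >= round(minimum_nonspeech_duration / frame_step)
    from <- from[c(TRUE, keep_gap)]
    to <- to[c(keep_gap, TRUE)]
  }

  # 4. Drop speech stretches shorter than the minimum speech duration.
  long_enough <- (to - from) >= round(minimum_speech_duration / frame_step)
  data.frame(
    start = from[long_enough] * frame_step,
    end = to[long_enough] * frame_step,
    label = rep("SPEECH", sum(long_enough))
  )
}

# Synthetic example: two long speech stretches split by a 50 ms dip,
# followed by a 50 ms burst that is too short to keep.
probs <- c(rep(0.1, 10), rep(0.9, 30), rep(0.2, 5), rep(0.9, 30),
           rep(0.1, 20), rep(0.9, 5), rep(0.1, 10))
vad_segments(probs)
```

With the default values, the 50 ms dip is bridged and the 50 ms burst is discarded, so the example yields a single SPEECH segment from 0.10 s to 0.75 s. Raising minimum_nonspeech_duration bridges longer pauses, while raising minimum_speech_duration removes more short stretches.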
Sections thought to contain speech will be marked in the levelname level by a SEGMENT with the label SPEECH. The levelname level will be cleared before labels are inserted if this function is applied to the database again. The speech segmentation model of the pyannote-audio framework is used for the speech segmentation \insertCite{Bredin.2019,Bredin.2021}{reindeer}.
A tibble
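The workflow described above (annotate, query for "VAD == SPEECH", then browse the result) might be sketched as follows. The package prefix reindeer:: is assumed from the text, and vad_workflow, db_path and the HF_TOKEN environment variable are hypothetical names introduced for illustration. The function is only defined here, not run, since running it requires an existing emuR database and a valid Hugging Face token.

```r
# Hypothetical wrapper for illustration; db_path and HF_TOKEN are assumed names.
vad_workflow <- function(db_path, auth_key = Sys.getenv("HF_TOKEN")) {
  # Load an existing emuR database from disk.
  db <- emuR::load_emuDB(db_path, verbose = FALSE)

  # Create (or overwrite) the VAD level with SPEECH segments.
  # The package prefix is an assumption; argument names follow the usage section.
  reindeer::annotate_voiceactivity(
    db,
    auth_key = auth_key,
    levelname = "VAD",
    speech_probability_threshold = 0.6,
    nospeech_probability_threshold = 0.4
  )

  # Retrieve the detected speech portions as a segment list.
  speech <- emuR::query(db, "VAD == SPEECH")

  # Open the database for browsing, restricted to the detected portions.
  emuR::serve(db, seglist = speech)

  invisible(speech)
}
```

If the detected portions are too fragmented or too coarse for the recordings at hand, the same call can be repeated with different thresholds, since the level is cleared before new labels are inserted.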