View source: R/curate_swda_data.R
curate_swda_data | R Documentation
Process and curate Switchboard Dialog Act (SWDA) data by reading all .utt files from a specified directory and converting them into a structured format.
curate_swda_data(dir_path)
dir_path: Character string. Path to the directory containing .utt files. Must be an existing directory.
The function expects a directory that contains .utt files, either directly or in subdirectories, as found in the raw SWDA data (Linguistic Data Consortium, LDC97S62: Switchboard Dialog Act Corpus).
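Before calling the function, it can help to confirm that a directory really contains .utt files at some depth. The sketch below uses base R only. `find_utt_files()` is a hypothetical helper, not part of qtkit, and the directory and file names are made up for the demonstration. It does not claim to mirror how `curate_swda_data()` locates files internally.

```r
# Hypothetical helper (not part of qtkit): list .utt files under a directory,
# searching subdirectories as well.
find_utt_files <- function(dir_path) {
  stopifnot(
    is.character(dir_path),
    length(dir_path) == 1,
    dir.exists(dir_path)
  )
  list.files(dir_path, pattern = "\\.utt$", recursive = TRUE, full.names = TRUE)
}

# Build a small throwaway directory tree (made-up names) to try it out
demo_dir <- file.path(tempdir(), "swda_demo")
dir.create(file.path(demo_dir, "sw00utt"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, "sw00utt", "sw_0001_4325.utt")))
invisible(file.create(file.path(demo_dir, "README.txt")))

utt_files <- find_utt_files(demo_dir)
length(utt_files)   # number of .utt files found
basename(utt_files) # file names only
```

If `length(utt_files)` is zero, the path probably points at the wrong level of the corpus directory.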
A data frame containing the curated SWDA data with columns:
doc_id: Document identifier
damsl_tag: Dialog act annotation
speaker_id: Unique speaker identifier
speaker: Speaker designation (A or B)
turn_num: Turn number in conversation
utterance_num: Utterance number
utterance_text: Actual spoken text
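As a rough illustration of working with a data frame of this shape, the sketch below builds a toy table with the same column names and summarizes it with base R. All values are invented, and the column types (character identifiers, numeric counters) are an assumption, since the types are not stated here.

```r
# Toy data with the documented column names; values and types are invented
toy_swda <- data.frame(
  doc_id         = rep("4325", 4),
  damsl_tag      = c("qy", "ny", "sd", "b"),
  speaker_id     = c("1632", "1519", "1519", "1632"),
  speaker        = c("A", "B", "B", "A"),
  turn_num       = c(1, 2, 2, 3),
  utterance_num  = c(1, 1, 2, 1),
  utterance_text = c("Do you have kids?", "Yes,", "I have two.", "Uh-huh."),
  stringsAsFactors = FALSE
)

# Frequency of each dialog act tag
tag_counts <- table(toy_swda$damsl_tag)

# Number of utterances per speaker within each document
utts_per_speaker <- aggregate(
  utterance_num ~ doc_id + speaker,
  data = toy_swda,
  FUN = length
)

tag_counts
utts_per_speaker
```

The same two summaries should apply to the output of `curate_swda_data()`, provided the returned columns behave as assumed above.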
# Example using simulated data bundled with the package
example_data <- system.file("extdata", "simul_swda", package = "qtkit")
swda_data <- curate_swda_data(example_data)
str(swda_data)