swda | R Documentation |
A dataset of 1,155 five-minute conversations among 441 speakers of American English, created in 1997 and tagged with a shallow discourse tagset of approximately 60 basic dialog act tags (DAMSL) and their combinations.
data("swda")
A data frame with 223,606 observations on the following 16 variables.
doc_id
ID for each conversation document
topic_num
Topic number associated with the conversation
topicality
The annotator's subjective rating of whether the callers generally conversed about what the recorded prompt suggested. Scale of 1 to 5, with 1 being the most on topic.
naturalness
The annotator's subjective rating of whether the conversation sounded natural. Scale of 1 to 5, with 1 being the most natural.
damsl_tag
DAMSL dialog act annotation labels
speaker
Label for each speaker in the conversation
turn_num
Number of contiguous utterance turns for a given speaker
utterance_num
The cumulative number of utterances in the conversation
utterance_text
The text of the utterance, including disfluency annotation (see details below).
speaker_id
ID for each speaker
sex
The biological sex of the speaker
birth_year
Year that the speaker was born
dialect_area
Region of the US where the speaker spent their first 10 years
education
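Highest educational level attained, coded 0, 1, 2, 3, or 9. In the Switchboard manual these codes stand for: 0 = less than high school, 1 = less than college, 2 = college, 3 = more than college, 9 = unknown.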
topic
Topic description
topic_prompt
Specific topic prompt for the conversation
More information on the metadata in this dataset can be found here: https://catalog.ldc.upenn.edu/docs/LDC97S62/swb1_manual.txt.
The SWBD-DAMSL manual can be found here: https://web.stanford.edu/~jurafsky/ws97/manual.august1.html.
The Dysfluency Annotation Stylebook for the Switchboard Corpus can be found here: https://staff.fnwi.uva.nl/r.fernandezrovira/teaching/DM-materials/DFL-book.pdf.
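Because utterance_text keeps the disfluency markup inline, simple pattern matching can flag annotated utterances. The following is a minimal sketch, not part of the package: it assumes that fillers are wrapped as "{F ... }" in the style of the stylebook, and it uses a few made-up utterances rather than rows from swda so that it runs on its own.

```r
# Toy utterances in the style of utterance_text (made up, not taken from the corpus).
utterances <- c(
  "{F Uh, } I think so. /",
  "Yeah. /",
  "{D Well, } [ I, + I ] guess not. /"
)

# TRUE where the text contains a filler marker; fixed = TRUE matches "{F" literally
# instead of treating "{" as regular-expression syntax.
has_filler <- grepl("{F", utterances, fixed = TRUE)

# Share of utterances with at least one filler.
filler_rate <- mean(has_filler)
```

With the real data, the same call could be applied to swda$utterance_text once the dataset is loaded.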
Switchboard-1 Release 2 https://catalog.ldc.upenn.edu/docs/LDC97S62/
Godfrey, John J., and Edward Holliman. Switchboard-1 Release 2 LDC97S62. Web Download. Philadelphia: Linguistic Data Consortium, 1993.
Jurafsky, Daniel, Elizabeth Shriberg, and Debra Biasca. 1997. "Switchboard SWBD-DAMSL Shallow-Discourse-Function Annotation Coders Manual, Draft 13." University of Colorado, Boulder Institute of Cognitive Science Technical Report 97-02.
Meteer, Marie, and Ann Taylor. 1995. Dysfluency Annotation Stylebook for the Switchboard Corpus.
data(swda)
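Once loaded, the data frame can be summarised with base R. The sketch below is illustrative only: `toy` is a hypothetical stand-in with made-up rows that reuses a few of the documented column names, so the snippet runs without the package. Replacing `toy` with `swda` should give the corresponding summaries for the full dataset.

```r
# Toy stand-in for swda: made-up rows using a few of the documented columns.
toy <- data.frame(
  doc_id        = c("4325", "4325", "4325", "2151", "2151"),
  damsl_tag     = c("sd", "b", "sd", "qy", "sv"),
  speaker       = c("A", "B", "A", "A", "B"),
  utterance_num = c(1L, 2L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Frequency of each dialog act tag, most common first.
tag_counts <- sort(table(toy$damsl_tag), decreasing = TRUE)

# Utterances per conversation: utterance_num is cumulative within a conversation,
# so its maximum is the conversation's utterance count.
utts_per_doc <- tapply(toy$utterance_num, toy$doc_id, max)
```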