R/sdac.R

#' Switchboard Dialog Act Corpus
#'
#' A dataset containing the 1,150 conversations of 440 speakers of American
#' English. More information on the metadata is available in the corpus manual:
#' \url{https://catalog.ldc.upenn.edu/docs/LDC97S62/swb1_manual.txt}.
#'
#' @format A data frame with 223,606 rows and 20 variables:
#' \describe{
#'   \item{doc_id}{ID for each conversation document}
#'   \item{damsl_tag}{DAMSL dialog act annotation labels}
#'   \item{speaker}{Label for each speaker in the conversation}
#'   \item{turn_num}{Number of contiguous utterance turns for a given speaker}
#'   \item{utterance_num}{The cumulative number of utterances in the conversation}
#'   \item{utterance_text}{The actual dialog utterance}
#'   \item{speaker_id}{Unique speaker identification code}
#'   \item{sex}{Sex of the speaker}
#'   \item{birth_year}{Year that the speaker was born}
#'   \item{dialect_area}{Region of the US where the speaker spent their first 10 years}
#'   \item{education}{Highest educational level attained}
#'   \item{ti}{...}
#'   \item{payment_type}{Form of payment for participation}
#'   \item{amt_pd}{Payment amount for participation}
#'   \item{remarks}{Misc. comments}
#'   \item{calls_deleted}{...}
#'   \item{speaker_partition}{...}
#' }
#' @source \url{https://catalog.ldc.upenn.edu/docs/LDC97S62/}
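#' @examples
#' # A minimal sketch of inspecting the dataset. It assumes the analyzr package
#' # is installed and that `sdac` can be loaded with data(); neither assumption
#' # is confirmed here, so the code is wrapped in \dontrun{}.
#' \dontrun{
#' library(analyzr)
#' data(sdac)
#'
#' # Overall shape and column types
#' str(sdac)
#'
#' # Frequency of each DAMSL dialog act label, most common first
#' head(sort(table(sdac$damsl_tag), decreasing = TRUE))
#'
#' # Number of utterances per conversation document
#' head(table(sdac$doc_id))
#' }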
"sdac"
WFU-TLC/analyzr documentation built on June 4, 2019, 2:27 p.m.