R/sbc.R

#' Santa Barbara Corpus of Spoken American English
#'
#' A dataset containing 15,475 utterances by 44 speakers of American
#' English.
#'
#' @format A data frame with 15,475 rows and 13 variables:
#' \describe{
#'   \item{id}{ID for each speaker}
#'   \item{name}{Name of each speaker}
#'   \item{gender}{Gender of the speaker}
#'   \item{age}{Age of the speaker at recording}
#'   \item{dialect}{Dialect self-assessment for each speaker}
#'   \item{dialect_state}{State where each speaker was raised}
#'   \item{current_state}{State of residence for each speaker at recording}
#'   \item{highest_edu}{Highest educational degree obtained}
#'   \item{years_edu}{Number of years of formal education}
#'   \item{occupation}{Occupation of the speaker at recording}
#'   \item{ethnicity}{Ethnicity self-assessment for each speaker}
#'   \item{utterance}{Annotated transcription of a speaker's utterance}
#'   \item{utterance_clean}{Simplified transcription of a speaker's utterance}
#' }
#' @source \url{http://www.linguistics.ucsb.edu/research/santa-barbara-corpus}
"sbc"
WFU-TLC/analyzr documentation built on June 4, 2019, 2:27 p.m.