TD Deletion data from the Buckeye Corpus
A data frame with 26 columns
Speaker ID from the Buckeye corpus metadata
The recording id
The word in which the token appeared
Time stamp for the word onset in the recording
Time stamp for the word offset in the recording
The part of speech tag from the Buckeye data
Whether the token in question was a /t/ or a /d/
How the final phone of the word was phonetically transcribed
The preceding segment in the canonical transcription
The following segment in the canonical transcription
The number of syllables in a maximum 8-word window surrounding the target word, based on the dictionary entries
The number of syllables in a maximum 8-word window surrounding the target word, based on the phonetic transcription
Number of preceding words included in the window
Number of following words in the window
Time stamp of the contextual window onset
Time stamp of the contextual window offset
Number of syllables per second, based on the number of syllables in the dictionary entries
Number of syllables per second, based on the phonetic transcription
The word following the token
The broader context in which the token was found
A finer-grained coding of grammatical class
A coarser-grained coding of grammatical class
A coding of the preceding segment
A coding of the following segment
A finer-grained coding of the realization of the /t/ or /d/
A coarse-grained coding of the /t/ or /d/ into a 1 or 0
This data was automatically generated from the Buckeye corpus by comparing the canonical transcription for each word to its phonetic transcription in the corpus. It includes estimates for rate of speech (syllables per second), as well as two different sets of morphological coding.
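The two rate-of-speech estimates can be illustrated with a minimal sketch. It assumes that each rate is the syllable count for the contextual window divided by the window's duration (offset minus onset); the source does not spell out the formula, and the function name, argument names, and example values below are hypothetical.

```python
def syllables_per_second(n_syllables, window_onset, window_offset):
    """Rate of speech over a contextual window, in syllables per second.

    Assumption: the rate is the syllable count divided by the window
    duration. Argument names are hypothetical, not the dataset's columns.
    """
    duration = window_offset - window_onset
    if duration <= 0:
        # A zero-length or inverted window has no meaningful rate.
        return None
    return n_syllables / duration


# Hypothetical token: a 3-second window with 12 syllables according to the
# dictionary entries and 10 according to the phonetic transcription.
dict_rate = syllables_per_second(12, 101.5, 104.5)
phon_rate = syllables_per_second(10, 101.5, 104.5)
```

The transcription-based rate is lower here simply because fewer syllables were counted in the same window, as can happen when syllables are reduced in conversational speech.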
The coding scheme for Gram is as follows:
and
The word "and"
justT
Past tense and participial forms that just have a final [d] -> [t] change. Specifically, "built", "sent", and "spent".
mono
Any word that doesn't have verbal morphology, and isn't a contraction.
nochange
No-change past tense forms. Specifically, "cost", "burst", "cast" and its contractions
nt
Contractions with "not", e.g. "don't".
past
Regular past tense
semiweakD
Verbs that have a stem change and add /d/. Specifically, "heard", "sold", "told", and "unheard".
semiweakT
Verbs that have a stem change and add /t/. e.g. "felt", "kept"
stemchange
Verbs that have a stem change, and no apparent affix. Specifically "bound", "found", and "held"
went
The word "went"
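Several of these categories are defined by explicit word lists, so they can be looked up directly. The sketch below covers only the categories whose members are fully enumerated above, plus the "n't" contractions; it is an illustration, not the procedure used to build the dataset. The names CLOSED_CLASSES and closed_class_gram are hypothetical. Categories that depend on morphological analysis (mono, past) or are given only by example (semiweakT) are not handled and return None.

```python
# Word lists copied from the category descriptions above.
CLOSED_CLASSES = {
    "and": {"and"},
    "justT": {"built", "sent", "spent"},
    "nochange": {"cost", "burst", "cast"},
    "semiweakD": {"heard", "sold", "told", "unheard"},
    "stemchange": {"bound", "found", "held"},
    "went": {"went"},
}


def closed_class_gram(word):
    """Return the Gram label for a word from an enumerated category.

    Returns None for any word whose category cannot be decided from the
    word lists alone (mono, past, semiweakT).
    """
    w = word.lower()
    if w.endswith("n't"):
        return "nt"
    for label, members in CLOSED_CLASSES.items():
        if w in members:
            return label
    return None
```

For example, closed_class_gram("told") gives "semiweakD", while closed_class_gram("walked") gives None because regular past tense forms are not identifiable from a word list.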
The coding scheme for Gram2 is identical, except that it collapses the semiweakD and semiweakT categories from Gram.
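The relationship between the two codings can be sketched as a simple recoding. The label used for the merged category is not stated here, so "semiweak" below is an assumption, as is the function name gram_to_gram2.

```python
def gram_to_gram2(gram, merged_label="semiweak"):
    """Map a Gram label to its Gram2 counterpart.

    Assumption: the merged category is called "semiweak"; pass a different
    merged_label if the dataset uses another name.
    """
    if gram in ("semiweakD", "semiweakT"):
        return merged_label
    # Every other category is unchanged between the two schemes.
    return gram
```

Keeping the merged label as a parameter avoids hard-coding a name that the description above does not confirm.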
Pitt, M. A., Dilley, L., Johnson, K., Kiesling, S., Raymond, W., Hume, E., & Fosler-Lussier, E. (2007). Buckeye Corpus of Conversational Speech (2nd release). Columbus, OH. Retrieved from www.buckeyecorpus.osu.edu