This package contains datasets created from a collection of 14 English child language corpora in the Child Language Data Exchange System (CHILDES). The transcripts come from childes-db (version 2018.1) and are accessed with the childesr package.
More information about the individual corpora can be found via the CHILDES website.
This subset of corpora was chosen from the English North American collection of CHILDES based on the following criteria.
This package currently contains three datasets:
childes_utterances: utterances of children from the 14 corpora
childes_tokens: tokenized utterances of children from the 14 corpora
childes_types: counts of word types in childes_tokens across all 14 corpora

Install the package from GitHub:

# install.packages("remotes")
remotes::install_github("gracelawley/childes")
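
Once the package is installed, the three datasets can be inspected as in the sketch below. It is a minimal example, not the author's documented workflow: it assumes the datasets are exported under the names listed above and makes no assumption about their columns, so it only prints each object's dimensions and structure.

```r
# Dataset names as listed in this README; their column layout is not
# documented here, so the code below only inspects whatever is loaded.
dataset_names <- c("childes_utterances", "childes_tokens", "childes_types")

if (requireNamespace("childes", quietly = TRUE)) {
  # Load the datasets into a separate environment to keep the workspace clean.
  env <- new.env()
  data(list = dataset_names, package = "childes", envir = env)

  # Print the dimensions and top-level structure of each dataset that loaded.
  for (name in ls(env)) {
    cat("\n", name, ":", paste(dim(env[[name]]), collapse = " x "), "\n")
    str(env[[name]], max.level = 1, give.attr = FALSE)
  }
} else {
  message("Package 'childes' is not installed; run the install command first.")
}
```

The `requireNamespace()` guard lets the script run without error on a machine where the package has not been installed yet.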
CHILDES is part of the TalkBank database, which asks users to follow the TalkBank data usage rules.
Except where otherwise indicated, the use of TalkBank data is governed by the Creative Commons CC BY-NC-SA 3.0 license.
CHILDES: MacWhinney, B. (2000). The CHILDES Project: Tools for analyzing talk. Third Edition. Mahwah, NJ: Lawrence Erlbaum Associates.
childes-db: Sanchez, A., Meylan, S., Braginsky, M., MacDonald, K. E., Yurovsky, D., & Frank, M. C. (2018, April 23). childes-db: a flexible and reproducible interface to the Child Language Data Exchange System. Retrieved from psyarxiv.com/93mwx