This package contains datasets created from a collection of 14 English child language corpora in the Child Language Data Exchange System (CHILDES). The transcripts come from childes-db (version 2018.1) and are accessed with the childesr package.

The 14 Corpora

More information about the individual corpora can be found via the CHILDES website.

Inclusion/Exclusion Criteria

This subset of corpora was chosen from the English North American collection of CHILDES based on the following criteria.



Language Sampling Procedure


This package currently contains three datasets:


# install.packages("remotes")
remotes::install_github("gracelawley/childes")

CHILDES Usage Rules

CHILDES is part of the TalkBank database and asks users to follow its data usage rules.

Except where otherwise indicated, the use of TalkBank data is governed by the Creative Commons CC BY-NC-SA 3.0 license.

Works Cited

MacWhinney, B. (2000). The CHILDES Project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Sanchez, A., Meylan, S., Braginsky, M., MacDonald, K. E., Yurovsky, D., & Frank, M. C. (2018, April 23). childes-db: a flexible and reproducible interface to the Child Language Data Exchange System. Retrieved from

gracelawley/childes documentation built on May 7, 2019, 6:07 a.m.