In gracelawley/childes: A Collection of Child Language Samples from 14 Corpora in CHILDES (2018.1)

childes

This package contains datasets created from a collection of 14 English child language corpora in the Child Language Data Exchange System (CHILDES). The transcripts are from the childes-db (version 2018.1) and are accessed with the childesr package.

More information about the individual corpora can be found via the CHILDES website.

This subset of corpora was chosen from the English North American collection of CHILDES based on the following criteria.

Corpora

Part of the Eng-NA section of CHILDES
Sufficient information about the details of the study and corpus is available

Participants

36-96 months old
English as first and primary language
No reported gross sensory impairments (e.g. hearing impairment), congenital defects, developmental disabilities, or atypical development
No significant/regular exposure to another language (i.e. 75% or higher consistent exposure to a language other than English)

Language Sampling Procedure

Naturalistic and unscripted elicitations (in either naturalistic or laboratory settings)
Intelligible speech
One-on-one conversations (e.g. child-examiner conversations or parent-child conversations)
Can be child-child conversations as long as both children meet the participant requirements
No conversations amongst a group of children
No reading from books, etc.
No restricted vocabulary that is caused by the structure of the study or the experiment design
e.g. no samples of free play sessions for multiple participants that each involve the same set of experimenter-provided toys
No structured speech
e.g. speech from an interview that has been tailored for a specific experimental interest(s)
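
The participant criteria above can be read as a row-wise filter. The following is a minimal sketch in base R, not the procedure used to build this package: the data frame, its column names, and its values are all hypothetical, and the age range is assumed to be inclusive at both ends.

```r
# Hypothetical participant records; every column name and value is illustrative.
participants <- data.frame(
  id                  = c("p1", "p2", "p3", "p4"),
  age_months          = c(30, 48, 96, 60),
  first_language      = c("eng", "eng", "eng", "spa"),
  typical_dev         = c(TRUE, TRUE, TRUE, TRUE),
  other_lang_exposure = c(0.00, 0.10, 0.00, 0.80),
  stringsAsFactors    = FALSE
)

# TRUE for each row that satisfies all participant criteria.
meets_criteria <- function(df) {
  df$age_months >= 36 & df$age_months <= 96 &  # 36-96 months (assumed inclusive)
    df$first_language == "eng" &               # English as first language
    df$typical_dev &                           # no reported atypical development
    df$other_lang_exposure < 0.75              # under 75% exposure to another language
}

eligible <- participants[meets_criteria(participants), ]
print(eligible$id)
```

In this toy table, p1 is excluded for age and p4 for first language and exposure, so only p2 and p3 remain.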

This package currently contains three datasets:

childes_utterances: utterances of children from the 14 corpora
childes_tokens: tokenized utterances of children from the 14 corpora
childes_types: counts of word types in childes_tokens across all 14 corpora
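
To make the relationship between utterances, tokens, and types concrete, here is a small base R sketch. It uses toy utterances and an assumed, simplified tokenization (lowercase, split on whitespace); the package's actual tokenization and the column names of its datasets may differ.

```r
# Toy utterances; these are illustrative and not taken from the package.
utterances <- c("the dog runs", "The dog sleeps", "a cat")

# Tokens: lowercase each utterance, then split on whitespace (assumed scheme).
tokens <- unlist(strsplit(tolower(utterances), "\\s+"))

# Types: one row per distinct word with its frequency.
counts <- table(tokens)
types <- data.frame(
  word = names(counts),
  n    = as.integer(counts),
  stringsAsFactors = FALSE
)

# Sort by descending frequency, then alphabetically.
types <- types[order(-types$n, types$word), ]

print(length(tokens))  # number of tokens
print(nrow(types))     # number of types
print(types)
```

Here the three utterances yield 8 tokens and 6 types, with "dog" and "the" each counted twice.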

# install.packages("remotes")
remotes::install_github("gracelawley/childes")

CHILDES is part of the TalkBank database and asks users to follow its data usage rules.

Except where otherwise indicated, the use of TalkBank data is governed by the Creative Commons CC BY-NC-SA 3.0 license.

CHILDES: MacWhinney, B. (2000). The CHILDES Project: Tools for analyzing talk. Third Edition. Mahwah, NJ: Lawrence Erlbaum Associates.

childes-db: Sanchez, A., Meylan, S., Braginsky, M., MacDonald, K. E., Yurovsky, D., & Frank, M. C. (2018, April 23). childes-db: a flexible and reproducible interface to the Child Language Data Exchange System. Retrieved from psyarxiv.com/93mwx

gracelawley/childes documentation built on May 7, 2019, 6:07 a.m.
