childes_tokens: child_tokens

Description

A data frame of tokenized child utterances from 14 corpora in the CHILDES database (version 2018.1). It is a tokenized version of the childes_utterances data frame, with one row per token.

Usage

childes_tokens

Format

A data frame with 1,021,343 rows representing individual tokens and 7 columns:

collection

a factor denoting the CHILDES collection of the corpus

corpus

a factor denoting corpus name

child

a character string giving the child's name

sex

a factor giving the child's sex

age

a double giving the child's age

transcript_id

an integer denoting the transcript ID of an utterance

token

a character string giving a token
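
The column layout above can be illustrated with a small stand-in. This is a minimal sketch, not the packaged data: toy_tokens and every value in it (names, ages, tokens) are made up for illustration, and the age values are arbitrary because the unit of age is not stated here. It only shows the seven column types and how per-child and per-token counts might be computed in base R.

```r
# Hypothetical toy data frame mirroring the seven documented columns.
# All values are invented; this is NOT an excerpt of childes_tokens.
toy_tokens <- data.frame(
  collection    = factor(c("Eng-NA", "Eng-NA", "Eng-NA", "Eng-UK")),
  corpus        = factor(c("CorpusA", "CorpusA", "CorpusA", "CorpusB")),
  child         = c("ChildA", "ChildA", "ChildA", "ChildB"),
  sex           = factor(c("female", "female", "female", "male")),
  age           = c(24.5, 24.5, 24.5, 30.0),  # arbitrary values, unit assumed unknown
  transcript_id = c(1L, 1L, 1L, 2L),
  token         = c("more", "juice", "more", "ball"),
  stringsAsFactors = FALSE
)

# Inspect the column types (factor, character, double, integer).
str(toy_tokens)

# Number of tokens (rows) contributed by each child.
tokens_per_child <- table(toy_tokens$child)

# Token frequencies, most frequent first.
token_freq <- sort(table(toy_tokens$token), decreasing = TRUE)
head(token_freq)
```

With the package loaded, the same calls should apply to childes_tokens itself by substituting it for toy_tokens.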

Source

The data come from the CHILDES database (https://childes.talkbank.org) and were accessed using the childesr package (https://github.com/langcog/childesr).


gracelawley/childes documentation built on May 7, 2019, 6:07 a.m.