getTokens: Get token metadata

getTokens    R Documentation

Get token metadata

Description

Query a table with all the words from the selected transcripts, one word (token) per row. Each row has:

  • Link to view transcript and play any associated media.

  • Corpus path to transcript.

  • Utterance sequence number (starts at 0).

  • Word sequence number within utterance (starts at 0).

  • Speaker's role.

  • Speaker's ID.

  • The word (token).

  • The word's stem.

  • Part of speech code. (See the CHAT manual for descriptions of the codes.)

Usage

getTokens(
  corpusName = NULL,
  corpora = NULL,
  lang = NULL,
  media = NULL,
  age = NULL,
  gender = NULL,
  designType = NULL,
  activityType = NULL,
  groupType = NULL,
  auth = FALSE
)

Arguments

corpusName

Name of corpus to query. For example, to search within the childes corpus: corpusName="childes".

corpora

Path to the corpus or corpus subfolder to query. The path starts with the corpus name, followed by the subfolder names leading to a folder; all transcripts beneath that folder will be queried. For example, to query all transcripts in the MacWhinney childes corpus: corpora = c('childes', 'Eng-NA', 'MacWhinney').

lang

Query by language. For example, to get transcripts that contain both English and Spanish: lang=c("eng", "spa"). Legal values: 3-letter language codes based on the ISO 639-3 standard.

media

Query by media type. For example, to get transcripts with an associated video recording: media=c("video"). Legal values: "audio" or "video".

age

Query by participant age range, in months. For example, to get transcripts with target participants who are 14-18 months old: age=c(from="14", to="18").

gender

Query by participant gender. For example, to get transcripts with female target participants: gender=c("female"). Legal values: "female" or "male".

designType

Query by design type. For example, to get transcripts from a longitudinal study: designType=c("long"). Legal values are "long" for longitudinal studies and "cross" for cross-sectional studies.

activityType

Query by activity type. For example, to get transcripts where the target participant is engaged in toy play: activityType=c("toyplay"). See the CHAT manual for legal values.

groupType

Query by group type. For example, to get transcripts where the target participant is hearing limited: groupType=c("HL"). See the CHAT manual for legal values.

auth

Whether the user should be prompted to authenticate in order to access protected collections. Defaults to FALSE.

Examples

getTokens(corpusName = 'childes',
          corpora = c('childes',
                      'Eng-NA',
                      'MacWhinney',
                      '010411a'))
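
The filter arguments described under Arguments can presumably be combined in a single call. The following is a minimal sketch, not an official example: it assumes the package is installed under the name TBDBr, that a network connection to TalkBank is available, and that the result is a table with one row per token. The particular filter combination is chosen only for illustration and may match few or no transcripts.

```r
# Collect the filters in a named list so they are easy to inspect and reuse.
# Values follow the argument descriptions above; the combination itself is
# an illustrative assumption.
query <- list(
  corpusName = "childes",
  corpora    = c("childes", "Eng-NA", "MacWhinney"),
  lang       = c("eng"),
  media      = c("audio")
)

# Run the query only when TBDBr is installed. The call needs network access,
# so any error (e.g. no connection) is caught and NULL is returned instead.
tokens <- NULL
if (requireNamespace("TBDBr", quietly = TRUE)) {
  tokens <- tryCatch(
    do.call(TBDBr::getTokens, query),
    error = function(e) NULL
  )
}

# Peek at the first rows if anything came back.
if (!is.null(tokens)) {
  print(head(tokens))
}
```

Keeping the filters in a list and calling getTokens through do.call makes it straightforward to add or drop a filter (for example gender or designType) without rewriting the call.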

TalkBank/TBDBr documentation built on Feb. 4, 2024, 2:25 p.m.