ml_vocabulary: Generate participant information and progress for each...

View source: R/vocabulary.R

ml_vocabulary    R Documentation

Generate participant information and progress for each response

Description

This function generates a data frame with the vocabulary of each participant (keeping longitudinal data from the same participant in different rows). Comprehensive and productive vocabulary sizes are computed as raw counts (vocab_count) and as proportions (vocab_prop), calculated from the total number of items filled in by the participant in the response (vocab_n).
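
The relationship between the three quantities can be sketched in base R on toy data. The data frame below, its column names (id, item, understands) and its values are hypothetical and do not reproduce the structure returned by ml_responses; the sketch only illustrates that a proportion is the count divided by the number of items answered.

```r
# Hypothetical toy responses: one row per participant-item pair.
# Column names and values are assumptions made for illustration only.
toy_responses <- data.frame(
  id = rep(c("a", "b"), each = 4),
  item = rep(c("item_1", "item_2", "item_3", "item_4"), times = 2),
  understands = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Raw count of items understood and number of items answered, per participant
vocab_count <- tapply(toy_responses$understands, toy_responses$id, sum)
vocab_n <- tapply(toy_responses$understands, toy_responses$id, length)

# One row per participant: count, denominator, and proportion
toy_vocabulary <- data.frame(
  id = names(vocab_count),
  vocab_count = as.vector(vocab_count),
  vocab_n = as.vector(vocab_n),
  vocab_prop = as.vector(vocab_count / vocab_n)
)
print(toy_vocabulary)
```

In this toy case participant "a" understands 3 of 4 items and participant "b" understands 1 of 4, so the proportions are 0.75 and 0.25.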

Usage

ml_vocabulary(
  participants = NULL,
  responses = NULL,
  by = NULL,
  scale = "count"
)

Arguments

participants

Participants data frame, as generated by ml_participants. If NULL (default), ml_participants is run.

responses

Responses data frame, as generated by ml_responses. If NULL (default), ml_responses is run.

by

A character vector with the name(s) of the variable(s) to group data by. Vocabulary metrics will be calculated by aggregating responses within the groups that result from crossing the variables provided in by. These variables can refer to item properties (see pool, e.g., "category") or to participant properties (see ml_logs(), e.g., "dominance").

scale

A character vector that takes the values "count" and/or "prop". If "count" (default), vocabulary metrics are reported as counts (number of words). If "prop", vocabulary metrics are reported as proportions.
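
The grouping described for the by argument can be sketched in base R as follows. This is not the package's implementation: the toy data frame, its column names and the by_vars helper variable are hypothetical, and aggregate() stands in for whatever grouping ml_vocabulary performs internally. The sketch shows counts and proportions computed within each combination of participant and grouping variable.

```r
# Hypothetical toy responses with one item-level grouping variable.
# Column names and values are assumptions made for illustration only.
toy_responses <- data.frame(
  id = rep(c("a", "b"), each = 4),
  category = rep(c("animals", "animals", "food", "food"), times = 2),
  understands = c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
)

# Variables to cross with the participant identifier (plays the role of `by`)
by_vars <- "category"
groups <- c("id", by_vars)

# Count of items understood within each id x category cell
vocab_count <- aggregate(
  toy_responses["understands"],
  by = toy_responses[groups],
  FUN = sum
)
names(vocab_count)[names(vocab_count) == "understands"] <- "vocab_count"

# Proportion of items understood within each id x category cell
vocab_prop <- aggregate(
  toy_responses["understands"],
  by = toy_responses[groups],
  FUN = mean
)
names(vocab_prop)[names(vocab_prop) == "understands"] <- "vocab_prop"

print(vocab_count)
print(vocab_prop)
```

Adding a second name to by_vars (for example a participant-level column) would cross both variables, yielding one row per participant and combination of values.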

Value

A data frame (a tibble) with each participant's comprehensive and/or productive vocabulary size in each language. This data frame contains the following variables:

id

a character string indicating a participant's identifier. This value is always the same for each participant, so that different responses from the same participant share the same id.

time

a numeric value indicating how many times a given participant has been sent the questionnaire, regardless of whether they completed it or not.

age

a numeric value indicating the number of months elapsed since participants' birth date until they filled in the last item of their questionnaire response.

type

a character string indicating the vocabulary type computed: "understands" if option Understands was selected, and "produces" if option Understands and Says was selected.

vocab_count_total

integer indicating the number of items selected as Understands or Understands and Says in both languages.

vocab_count_dominance_l1

positive integer indicating the number of items selected as Understands or Understands and Says in the dominant language (L1).

vocab_count_dominance_l2

positive integer indicating the number of items selected as Understands or Understands and Says in the non-dominant language (L2).

vocab_count_conceptual

positive integer indicating the number of translation equivalents (also known as cross-language synonyms or doublets) in which at least one of the items was selected as Understands or Understands and Says. This is a measure of the number of lexicalised concepts.

vocab_count_te

positive integer indicating the number of translation equivalents (out of the total number of items the participant answered) in which both items were selected as Understands or Understands and Says. This is a measure of the number of lexicalised concepts.

vocab_prop_total

numeric value ranging from 0 to 1 (both included) indicating the proportion of items selected as Understands or Understands and Says in both languages.

vocab_prop_dominance_l1

numeric value ranging from 0 to 1 (both included) indicating the proportion of items selected as Understands or Understands and Says in the dominant language (L1).

vocab_prop_dominance_l2

numeric value ranging from 0 to 1 (both included) indicating the proportion of items selected as Understands or Understands and Says in the non-dominant language (L2).

vocab_prop_conceptual

numeric value ranging from 0 to 1 (both included) indicating the proportion of translation equivalents (also known as cross-language synonyms or doublets) in which at least one of the items was selected as Understands or Understands and Says. This is a measure of the number of lexicalised concepts.

vocab_prop_te

numeric value ranging from 0 to 1 (both included) indicating the proportion of translation equivalents (also known as cross-language synonyms or doublets) in which both items were selected as Understands or Understands and Says. This is a measure of the number of lexicalised concepts.

The specific subset of columns returned by ml_vocabulary depends on the arguments provided.
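
The difference between the total, dominance, conceptual and translation-equivalent counts listed above can be sketched in base R on toy data. The data frame, its column names (te, language, known) and its values are hypothetical, and the code is an illustration of the definitions rather than the package's own computation. Each te value identifies one pair of translation equivalents, with one item in each language.

```r
# Hypothetical toy items for a single participant: three translation
# equivalent (TE) pairs, each with one item in L1 and one in L2.
# `known` is TRUE if the item was selected as Understands or
# Understands and Says. All names and values are assumptions.
toy_items <- data.frame(
  te = c(1, 1, 2, 2, 3, 3),
  language = rep(c("l1", "l2"), times = 3),
  known = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
)

# Number of known items within each TE pair (0, 1 or 2)
known_per_te <- tapply(toy_items$known, toy_items$te, sum)

counts <- c(
  total = sum(toy_items$known),                                     # known items, both languages
  dominance_l1 = sum(toy_items$known[toy_items$language == "l1"]),  # known items in L1
  dominance_l2 = sum(toy_items$known[toy_items$language == "l2"]),  # known items in L2
  conceptual = sum(known_per_te >= 1),                              # pairs with at least one known item
  te = sum(known_per_te == 2)                                       # pairs with both items known
)
print(counts)
```

Here the participant knows three items in total (two in L1, one in L2), covering two concepts, and knows both members of only one pair.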

Author(s)

Gonzalo Garcia-Castro


gongcastro/multilex documentation built on Oct. 21, 2022, 6:24 p.m.