NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true")
knitr::opts_chunk$set(
  purl = NOT_CRAN,
  collapse = TRUE,
  comment = "#>"
)
keyring::key_set_with_value(
  "multilex",
  "gonzalo.garciadecastro@upf.edu",
  Sys.getenv("ML_KEY")
)
library(multilex)
my_email <- "gonzalo.garciadecastro@upf.edu"
ml_connect(google_email = my_email)
The ml_vocabulary() function extracts vocabulary sizes for individual responses to any of the questionnaires:
ml_connect()
p <- ml_participants()
r <- ml_responses()
ml_vocabulary(participants = p, responses = r)
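Vocabulary sizes can be expressed in two different scales: counts (scale = "count") and proportions (scale = "prop").
By default, four modalities of vocabulary size are computed: total, dominance (dominant and non-dominant language), conceptual, and translation equivalents (TE).
Vocabulary sizes are also computed in two types: comprehension (understands) and production (produces).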
This is what the default output looks like:
library(multilex)
ml_connect()
p <- ml_participants()
r <- ml_responses(update = FALSE)
ml_vocabulary(participants = p, responses = r)
This data frame includes two rows per response, one for comprehension (receptive vocabulary) and one for production (productive vocabulary), and contains the following columns:
id: participant ID. This ID is unique for every participant and is the same across all responses to the questionnaire from the same participant.
time: number of times this participant has completed any of the questionnaires, including this one.
age: age in months at the time of completion.
type: vocabulary size type (understands for comprehension, produces for production).
vocab_count_total: total number of items the child was reported to know, summing both languages together.
vocab_count_dominance_l1: number of items the child was reported to know in their dominant language (e.g., Catalan words for a child whose language of most exposure is Catalan).
vocab_count_dominance_l2: number of items the child was reported to know in their non-dominant language (e.g., Spanish words for a child whose language of most exposure is Catalan).
vocab_count_conceptual: number of concepts for which the child knows at least one item (regardless of the language the item belongs to).
vocab_count_te: number of translation equivalents the child knows (that is, the number of concepts for which the child knows one item in each language).
This is what the output looks like when scale = "prop":
ml_vocabulary(p, r, scale = "prop")
This data frame follows a similar structure to the one returned by ml_vocabulary() when run with default arguments, but vocabulary sizes are now expressed as proportions:
id: participant ID. This ID is unique for every participant and is the same across all responses to the questionnaire from the same participant.
time: number of times this participant has completed any of the questionnaires, including this one.
age: age in months at the time of completion.
type: vocabulary size type (understands for comprehension, produces for production).
vocab_prop_total: proportion of items the child was reported to know, summing both languages together.
vocab_prop_dominance_l1: proportion of items the child was reported to know in their dominant language (e.g., Catalan words for a child whose language of most exposure is Catalan).
vocab_prop_dominance_l2: proportion of items the child was reported to know in their non-dominant language (e.g., Spanish words for a child whose language of most exposure is Catalan).
vocab_prop_conceptual: proportion of concepts for which the child knows at least one item (regardless of the language the item belongs to).
vocab_prop_te: proportion of translation equivalents the child knows (that is, the proportion of concepts for which the child knows one item in each language).
We can also ask for vocabulary sizes expressed in both scales (counts and proportions):
ml_vocabulary(p, r, scale = c("count", "prop"))
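To make the column definitions above more concrete, here is a minimal sketch in base R that computes the five count columns, and their proportion counterparts, for a single child from toy item-level data. It is not how multilex computes them internally: the items data frame, the dominant variable, and in particular the denominators used for the proportions are assumptions made for illustration only.

```r
# Toy item-level responses for ONE child and ONE vocabulary type (hypothetical data).
# Each concept has one item per language (a translation equivalent pair).
items <- data.frame(
  concept  = c("dog", "dog", "cat", "cat", "milk", "milk", "ball", "ball"),
  language = rep(c("Catalan", "Spanish"), times = 4),
  knows    = c(TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE)
)
dominant <- "Catalan"  # assumed language of most exposure for this child

# Number of known items per concept (0, 1, or 2)
known_per_concept <- tapply(items$knows, items$concept, sum)

counts <- c(
  total        = sum(items$knows),                              # both languages together
  dominance_l1 = sum(items$knows[items$language == dominant]),  # dominant language
  dominance_l2 = sum(items$knows[items$language != dominant]),  # non-dominant language
  conceptual   = sum(known_per_concept >= 1),                   # at least one item per concept
  te           = sum(known_per_concept == 2)                    # one item in each language
)

# Assumed denominators: all items, items per language, and number of concepts
denominators <- c(
  nrow(items),
  sum(items$language == dominant),
  sum(items$language != dominant),
  length(known_per_concept),
  length(known_per_concept)
)
props <- counts / denominators

counts
props
```

In this toy example the child knows four items in total, covering three concepts, but only one complete translation equivalent pair ("dog"), which is why the conceptual count is larger than the TE count.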
by argument
We can also compute vocabulary sizes conditional on variables at the item or participant level, such as semantic/functional category (category), cognate status (cognate), or language profile (lp), using the by argument. To see which variables are available, take a look at the variables included in the data frames returned by ml_participants() or in the pool of items. You can use this argument as follows:
ml_vocabulary(p, r, by = "dominance")
This data frame follows a similar structure to the ones above, but preserves a column for the grouping variable passed to by (for example, with by = "category", a category column indexing the functional/semantic category each item belongs to). The value of this argument is passed to dplyr's group_by under the hood. As with group_by, you can compute vocabulary sizes for combinations of variables:
ml_vocabulary(p, r, by = c("dominance", "lp"))
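As a rough illustration of what passing by through to dplyr's group_by implies, the sketch below counts known items within each combination of grouping variables on toy data. The items tibble and the count_by() helper are hypothetical and are not part of multilex; they only mimic the grouping behaviour described above.

```r
library(dplyr)

# Hypothetical item-level data: one row per item for a single response.
items <- tibble(
  id       = "child_01",
  category = c("Animals", "Animals", "Food", "Food", "Food"),
  cognate  = c("Cognate", "Non-cognate", "Cognate", "Cognate", "Non-cognate"),
  knows    = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Hypothetical helper: count known items within the groups named in `by`.
# The participant ID is always a grouping variable; `by` adds further ones.
count_by <- function(data, by = NULL) {
  data %>%
    group_by(across(all_of(c("id", by)))) %>%
    summarise(vocab_count_total = sum(knows), .groups = "drop")
}

count_by(items)                                 # one row per participant
count_by(items, by = "category")                # one row per participant and category
count_by(items, by = c("category", "cognate"))  # one row per combination of variables
```

Each additional variable in by splits the counts further, so the output gains one column per grouping variable and one row per observed combination of their values.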