ml_norms: Generate item-level norms for age, sex, language profile and...

View source: R/norms.R

ml_norms R Documentation

Generate item-level norms for age, sex, language profile and item dominance.

Description

This function generates a data frame with the estimated proportion of children that understand and/or produce a set of items, for a selected age range and set of participant profiles. Estimated proportions and their standard errors and confidence intervals are computed with the adjustment described by Gelman, Hill, and Vehtari (2020) to account for zero- and one-inflation (see functions prop_adj, prop_adj_se, and prop_adj_ci).

Usage

ml_norms(
  participants = NULL,
  responses = NULL,
  item = NULL,
  language = c("Catalan", "Spanish"),
  type = c("understands", "produces"),
  age = c(0, 100),
  lp = c("Bilingual", "Monolingual", "Other"),
  sex = c("Female", "Male"),
  category = NULL,
  .width = 0.95
)

Arguments

participants

Participants data frame, as generated by ml_participants. If NULL (default), ml_participants is run.

responses

Responses data frame, as generated by ml_responses. If NULL (default), ml_responses is run.

item

Character string indicating the item to compute norms for. If NULL (default), norms are computed for all items. Available items are listed in the pool data set, which can be loaded by running data("pool").

language

Character string indicating the language to compute vocabulary norms for. Takes "Catalan" and/or "Spanish" (defaults to both).

type

Character string indicating the vocabulary type to compute norms for. Takes "understands" and/or "produces" (defaults to both).

age

Numeric vector of length two (min-max) indicating the age range of participants to compute norms for.

lp

Character string indicating the language profile of participants to compute norms for. Takes "Bilingual", "Monolingual", and/or "Other" (defaults to all).

sex

Character string indicating the sex of participants to compute norms for. Takes "Female" and/or "Male" (defaults to both).

category

Character string indicating the semantic/functional categories to include items from. Available categories are listed in the pool data set, which can be loaded by running data("pool").

.width

Numeric value between 0 and 1 (both excluded) indicating the confidence level of the confidence intervals (defaults to 0.95).

Value

A data frame (a tibble) with the proportion of participants in the sample that understand or produce the items indicated in item, along with the standard error and confidence interval of the estimate. The output contains the following variables:

te

an integer identifying the Translation Equivalent (i.e., the pair of cross-language synonyms, or doublets) the item belongs to.

item

character string indicating the item identifier (e.g., spa_mesa). This value is unique for each item. Responses to the same item from different participants are linked by the same item value.

language

a character string indicating the language the item response belongs to: Catalan if the item is in Catalan, Spanish if the item is in Spanish.

age_bin

an integer indicating the age group of the participants for whom the estimates were computed (bins are 2 months wide by default).

type

a character string indicating the vocabulary type computed: "understands" if option Understands was selected, and "produces" if option Understands & Says was selected.

lp

a character string indicating participants' language profile, classified using parental reports of language exposure (see doe_spanish, doe_catalan, and doe_others) and the thresholds passed in the bilingual_threshold and other_threshold arguments.

category

a character string indicating the semantic/functional category the item belongs to (e.g., Vehicles, Actions).

item_dominance

a character string that takes the value L1 if the item belongs to participants' language of most exposure, and L2 if the item belongs to participants' language of least exposure.

label

a character string indicating the text presented to participants in the questionnaire (replacing the item identifier).

yes

a positive integer indicating the number of positive responses: a response of 2 (Understands) or 3 (Understands & Says) if type is understands, and a response of 3 (Understands & Says) if type is produces.

n

a positive integer indicating the total number of responses (useful for computing proportions).

proportion

a numeric value ranging from 0 to 1 (both included) indicating the estimated proportion of participants that provided a positive response, adjusted following Gelman, Hill, and Vehtari's (2020) method to account for zero- and one-inflation (see function prop_adj).

se

a numeric value indicating the standard error (SE) of the estimated proportion of participants that provided a positive response, adjusted following Gelman, Hill, and Vehtari's (2020) method to account for zero- and one-inflation (see function prop_adj_se).

ci_lower

a numeric value indicating the lower boundary of the confidence interval (CI; 95% by default, as set by .width) of the estimated proportion of participants that provided a positive response, adjusted following Gelman, Hill, and Vehtari's (2020) method to account for zero- and one-inflation (see function prop_adj_ci).

ci_upper

a numeric value indicating the upper boundary of the confidence interval (CI; 95% by default, as set by .width) of the estimated proportion of participants that provided a positive response, adjusted following Gelman, Hill, and Vehtari's (2020) method to account for zero- and one-inflation (see function prop_adj_ci).
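
To make the relationship between yes, n, proportion, se, ci_lower, and ci_upper concrete, the following is a minimal sketch in base R. It assumes the adjustment is the "add two successes and two failures" rule, and that a response of 1 means the child neither understands nor produces the item; neither assumption is confirmed by this page. The helpers adj_prop, adj_se, and adj_ci are hypothetical stand-ins written for illustration. They are not the package's prop_adj, prop_adj_se, and prop_adj_ci, which may differ.

```r
# Hypothetical helpers for illustration only (not the multilex implementation).
# Assumed adjustment: add two successes and two failures before estimating.
adj_prop <- function(yes, n) (yes + 2) / (n + 4)

adj_se <- function(yes, n) {
  p <- adj_prop(yes, n)
  sqrt(p * (1 - p) / (n + 4))
}

adj_ci <- function(yes, n, .width = 0.95) {
  p <- adj_prop(yes, n)
  z <- qnorm(1 - (1 - .width) / 2)  # normal quantile for the requested level
  half <- z * adj_se(yes, n)
  # Truncate to [0, 1] so the interval remains a valid proportion range
  c(lower = max(0, p - half), upper = min(1, p + half))
}

# Toy responses to a single item.
# Assumed coding: 1 = no, 2 = Understands, 3 = Understands & Says.
responses <- c(1, 2, 3, 3, 2, 1, 3, 2)

n <- length(responses)
yes_understands <- sum(responses >= 2)  # 2 or 3 counts for type "understands"
yes_produces <- sum(responses == 3)     # only 3 counts for type "produces"

adj_prop(yes_understands, n)
adj_se(yes_understands, n)
adj_ci(yes_produces, n, .width = 0.95)
```

Under this assumed rule the estimate never reaches exactly 0 or 1, even when no child or every child gives a positive response, which is the sense in which the adjustment handles zero- and one-inflation.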

Author(s)

Gonzalo Garcia-Castro

References

Gelman, A., Hill, J., & Vehtari, A. (2020). Regression and Other Stories. Cambridge University Press.

Examples

ml_norms(item = "cat_casa", type = "understands", age = c(20, 24))

gongcastro/multilex documentation built on Oct. 21, 2022, 6:24 p.m.