compose_data_to_infer_NIW_ideal_adaptor: Compose data for input to RStan

View source: R/compose-input-for-stan.R

compose_data_to_infer_NIW_ideal_adaptor R Documentation

Compose data for input to RStan

Description

Take exposure and test data as input, and prepare the data for input into an MVBeliefUpdatr Stan program.

Usage

compose_data_to_infer_NIW_ideal_adaptor(
  exposure,
  test,
  cues,
  category = "category",
  response = "response",
  group = "group",
  group.unique = NULL,
  center.observations = T,
  scale.observations = T,
  pca.observations = F,
  pca.cutoff = 1,
  lapse_rate = NULL,
  mu_0 = NULL,
  Sigma_0 = NULL,
  tau_scale = 0,
  L_omega_scale = 0,
  use_univariate_updating = FALSE,
  verbose = F
)

Arguments

exposure

'tibble' or 'data.frame' with the exposure data. Each row should be an observation of a category, and contain information about the category label, the cue values of the observation, and optionally grouping variables.

test

'tibble' or 'data.frame' with the test data. Each row should be an observation, and contain information about the cue values of the test stimulus and the participant's response.

cues

Names of columns with cue values. Must exist in both exposure and test data.

category

Name of column in exposure data that contains the category label. Can be NULL for unsupervised updating (not yet implemented). (default: "category")

response

Name of column in test data that contains participants' responses. (default: "response")

group

Name of column that contains information about which observations form a group. Typically, this is a variable identifying subjects/participants. Must exist in both exposure and test data. (default: "group")

group.unique

Name of column that uniquely identifies each group with identical exposure. This could be a variable indicating the different conditions in an experiment. Using group.unique is optional, but can be substantially more efficient if many groups share the same exposure. To ignore, set to NULL. (default: NULL)

center.observations

Should the data be centered based on cues' means during exposure? Note that the cues' means used for centering are calculated after aggregating the data to all unique combinations specified by group.unique. These means are only expected to be the same as the means over the entire exposure data if the exposure data are perfectly balanced with regard to group.unique. Centering will not affect the inferred correlation or covariance matrices, but it will affect the absolute position of the inferred means. The relative position of the inferred means remains unaffected. If TRUE and mu_0 is specified, mu_0 will also be centered (Sigma_0 is not affected by centering and thus not changed). (default: TRUE)

scale.observations

Should the data be standardized based on cues' standard deviations during exposure? Note that the cues' standard deviations used for scaling are calculated after aggregating the data to all unique combinations specified by group.unique. These standard deviations are only expected to be the same as the standard deviations over the entire exposure data if the exposure data are perfectly balanced with regard to group.unique. Scaling will not affect the inferred correlation matrix, but it will affect the inferred covariance matrix because it affects the inferred standard deviations. It will also affect the absolute position of the inferred means. The relative position of the inferred means remains unaffected. If TRUE and mu_0 and Sigma_0 are specified, mu_0 and Sigma_0 will also be scaled. (default: TRUE)
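The effect of aggregating before computing the centering and scaling constants can be illustrated in base R. This is a minimal sketch, not the package's internal code: the data frame, the column names (condition, subject, cue), and the choice to keep the first subject per condition are all assumptions made for illustration. Here, condition plays the role of group.unique.

```r
# Hypothetical exposure data: every subject within a condition hears the same
# tokens, but condition A has four subjects and condition B only one.
tokens_A <- c(10, 20, 30)
tokens_B <- c(40, 50, 60)

exposure <- rbind(
  data.frame(condition = "A", subject = rep(1:4, each = 3),
             cue = rep(tokens_A, times = 4)),
  data.frame(condition = "B", subject = 5, cue = tokens_B)
)

# Mean over the entire exposure data: condition A is counted four times.
mean_all <- mean(exposure$cue)

# Keep one instance of the exposure per condition (here: the first subject),
# then compute the constants used for centering and scaling.
first_subject <- tapply(exposure$subject, exposure$condition, min)
keep <- exposure$subject == first_subject[as.character(exposure$condition)]
one_per_condition <- exposure[keep, ]

cue_mean <- mean(one_per_condition$cue)
cue_sd <- sd(one_per_condition$cue)

# Center and scale all observations with the aggregated constants.
exposure$cue_scaled <- (exposure$cue - cue_mean) / cue_sd
```

Because the design is unbalanced, mean_all and cue_mean differ. With equal numbers of subjects per condition they would coincide.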

pca.observations

Should the data be transformed into orthogonal principal components? (default: FALSE)

pca.cutoff

Determines which principal components are handed to the MVBeliefUpdatr Stan program: the leading components needed to explain at least the proportion pca.cutoff of the total variance. Ignored if pca.observations = FALSE. (default: 1)
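The selection rule described for pca.cutoff can be sketched with base R's prcomp. This is an illustration under assumptions (simulated cues, a cutoff of 0.95, centering and scaling before the PCA), not the package's implementation, which may differ in its details.

```r
set.seed(1)
n <- 200
cue1 <- rnorm(n)
cue2 <- cue1 + rnorm(n, sd = 0.1)  # strongly correlated with cue1
cue3 <- rnorm(n)                   # independent of the other two cues
cue_matrix <- cbind(cue1, cue2, cue3)

# Transform the cues into orthogonal principal components.
pca <- prcomp(cue_matrix, center = TRUE, scale. = TRUE)

# Cumulative proportion of total variance explained by the leading components.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(prop_var)

# Keep the smallest number of leading components that reaches the cutoff.
cutoff <- 0.95
n_keep <- which(cum_var >= cutoff)[1]
components <- pca$x[, seq_len(n_keep), drop = FALSE]
```

Since cue1 and cue2 carry nearly the same information, two components are enough to pass the 0.95 threshold in this simulated example.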

lapse_rate, mu_0, Sigma_0

Optionally, lapse rate, prior expected category means (mu_0) and/or prior expected category covariance matrices (Sigma_0) for all categories. Lapse rate should be a number between 0 and 1. mu_0 and Sigma_0 should each be a list, with each element being the expected mean/covariance matrix for a specific category prior to updating. Elements of mu_0 and Sigma_0 should be ordered in the same order as the levels of the category variable in exposure and test. These prior expected means and covariance matrices could be estimated, for example, from phonetically annotated speech recordings (see make_MVG_from_data for a convenient way to do so). Internally, m_0 is then set to mu_0 (so that the expected value of the prior distribution of means is mu_0) and S_0 is set so that the expected value of the inverse-Wishart is Sigma_0 given nu_0. Importantly, Sigma_0 should be convolved with perceptual noise (i.e., add the perceptual noise covariance matrix to the category variability covariance matrices when you specify Sigma_0), since the Stan code for the inference of the NIW ideal adaptor does not infer category and noise variability separately.
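A minimal base R sketch of how mu_0 and Sigma_0 might be assembled. All numeric values are hypothetical, as is the value of nu_0. The last step uses the standard inverse-Wishart mean, E[Sigma] = S_0 / (nu_0 - D - 1) for nu_0 > D + 1, to show what "S_0 is set so that the expected value is Sigma_0" implies. Whether the package uses exactly this parameterization internally is an assumption.

```r
# Hypothetical category covariance matrices for two cues and two categories,
# e.g., as estimated from annotated recordings. The list order must follow
# the order of the category levels.
Sigma_category <- list(
  matrix(c(4, 1, 1, 2), nrow = 2),
  matrix(c(3, 0.5, 0.5, 1), nrow = 2)
)

# Hypothetical perceptual noise covariance (independent noise on each cue).
Sigma_noise <- diag(c(0.5, 0.25))

# Add the noise covariance to each category's covariance matrix.
Sigma_0 <- lapply(Sigma_category, function(S) S + Sigma_noise)

# Hypothetical prior expected category means, in the same category order.
mu_0 <- list(c(-1, 0), c(1, 0.5))

# Scatter matrices whose inverse-Wishart expectation equals Sigma_0,
# given nu_0 and the number of cues D (requires nu_0 > D + 1).
D <- 2
nu_0 <- 10
S_0 <- lapply(Sigma_0, function(S) S * (nu_0 - D - 1))
```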

tau_0_scales

Optionally, a vector of scales for the Cauchy priors for each cue's standard deviations. Used in both the prior for m_0 and the prior for S_0. (default: vector of 5s of length of cues, assumes scaled input)

omega_0_eta

Optionally, the eta for the LKJ prior for the correlations of the covariance matrix of mu_0. Set to 0 to ignore. (default: 0)

Details

It is important to use group to identify individuals that had a specific exposure (or no exposure at all) and specific test trials. You should not use group to identify exposure conditions. Setting group to an exposure condition results in an exposure that concatenates the exposure observations from all subjects in that condition. Typically, this is not what users intend, as it models exposure to the combination of exposure tokens across all subjects, rather than exposure to one set of those exposure tokens. To achieve this intended outcome, use group.unique to identify groups with identical exposure. This will correctly use only one unique instance of the observations that any level of group receives during exposure.
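The difference between the two ways of setting group can be made concrete with a small base R sketch. The design, column names, and cue values below are hypothetical: six subjects in two conditions, where every subject within a condition hears the same four exposure tokens.

```r
# Exposure tokens per condition (identical for all subjects in a condition).
exposure_tokens <- data.frame(
  condition = rep(c("shifted", "unshifted"), each = 4),
  category = rep(c("b", "p"), times = 4),
  cue1 = c(5, 45, 10, 50, 0, 40, 5, 45)
)

# Three subjects per condition.
subjects <- data.frame(
  subject = paste0("s", 1:6),
  condition = rep(c("shifted", "unshifted"), each = 3)
)

# One row per subject and exposure token.
exposure <- merge(subjects, exposure_tokens, by = "condition")

# If condition were used as `group`, each "group" would pool the tokens of
# all of its subjects: three times what any one listener actually heard.
n_pooled <- table(exposure$condition)

# With subject as `group` and condition as `group.unique`, the relevant
# exposure is the set of tokens a single subject received.
n_per_subject <- table(exposure$subject)
```

In this sketch each condition pools 12 observations, whereas each subject received only 4, which is the exposure the model is meant to capture.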

Value

A list consisting of a data_list and transform_information. The former is an NIW_ideal_adaptor_input.

See Also

is.NIW_ideal_adaptor_input


hlplab/MVBeliefUpdatr documentation built on March 29, 2025, 10:42 p.m.