fit_rtmpt: Posterior sample, diagnostics and some optional stuff for...

View source: R/call_rtmpt.R

fit_rtmptR Documentation

Posterior sample, diagnostics and some optional stuff for RT-MPT models

Description

Given model and data, this function calls an altered version of the C++ program by Klauer and Kellen (2018) to sample from the posterior distribution via a Metropolis-Gibbs sampler and storing it in an mcmc.list called samples. Posterior predictive checks developed by Klauer (2010), deviance information criterion (DIC; Spiegelhalter et al., 2002), 99% and 95% highest density intervals (HDI) together with the median will be provided for the main parameters in a list called diags. Optionally, the indices widely applicable information criterion (WAIC; Watanabe, 2010; Vehtari et al., 2017) and leave-one-out cross-validation (LOO; Vehtari et al., 2017) can be saved. Additionally the log-likelihood (LogLik) can also be stored. Some specifications of the function call are also saved in specs.

Usage

fit_rtmpt(
  model,
  data,
  n.chains = 4,
  n.iter = 5000,
  n.burnin = 200,
  n.thin = 1,
  Rhat_max = 1.05,
  Irep = 1000,
  prior_params = NULL,
  indices = FALSE,
  save_log_lik = FALSE,
  old_label = FALSE
)

Arguments

model

A list of the class rtmpt_model.

data

Optimally, a list of class rtmpt_data. Also possible is a data.frame or a path to the text file. Both, data.frame and the text file must contain the column names "subj", "group", "tree", "cat", and "rt" preferably but not necessarily in this order. The values of the latter must be in milliseconds. It is always advised to use to_rtmpt_data first, which gives back an rtmpt_data list with informations about the changes in the data, that were needed.

n.chains

Number of chains to use. Default is 4. Must be larger than 1 and smaller or equal to 16.

n.iter

Number of samples per chain. Default is 5000.

n.burnin

Number of warm-up samples. Default is 200.

n.thin

Thinning factor. Default is 1.

Rhat_max

Maximal Potential scale reduction factor: A lower threshold that needs to be reached before the actual sampling starts. Default is 1.05

Irep

Every Irep samples an interim state with the current maximal potential scale reduction factor is shown. Default is 1000. The following statements must hold true for Irep:

  • n.burnin is smaller than or equal to Irep,

  • Irep is a multiple of n.thin and

  • n.iter is a multiple of Irep / n.thin.

prior_params

Named list with prior parameters. All parameters have default values, that lead to uninformative priors. Vectors are not allowed. Allowed parameters are:

  • mean_of_exp_mu_beta: This is the a priori expected exponential rate (E(exp(beta)) = E(lambda)) and 1/mean_of_exp_mu_beta is the a priori expected process time (1/E(exp(beta)) = E(tau)). The default mean is set to 10, such that the expected a priori process time is 0.1 seconds.

  • var_of_exp_mu_beta: The a priori group-specific variance of the exponential rates. Since exp(mu_beta) is Gamma distributed, the rate of the distribution is just mean divided by variance and the shape is the mean times the rate. The default is set to 100.

  • mean_of_mu_gamma: This is the a priori expected mean parameter of the encoding and response execution times, which follow a normal distribution truncated from below at zero, so E(mu_gamma) < E(gamma). The default is 0.

  • var_of_mu_gamma: The a priori group-specific variance of the mean parameter. Its default is 10.

  • mean_of_omega_sqr: This is the a priori expected residual variance (E(omega^2)). Its distribution differs from the one used in the paper. Here it is a Gamma distribution instead of an improper one. The default is 0.005.

  • var_of_omega_sqr: The a priori variance of the residual variance (Var(omega^2)). The default is 0.01. The default of the mean and variance is equivalent to a shape and rate of 0.0025 and 0.5, respectivly.

  • df_of_sigma_sqr: A priori degrees of freedom for the individual variance of the response executions. The individual variance has a scaled inverse chi-squared prior with df_of_sigma_sqr degrees of freedom and omega^2 as scale. 2 is the default and it should be an integer.

  • sf_of_scale_matrix_SIGMA: The original scaling matrix (S) of the (scaled) inverse Wishart distribution for the process related parameters is an identity matrix S=I. sf_of_scale_matrix_SIGMA is a scaling factor, that scales this matrix (S=sf_of_scale_matrix_SIGMA*I). Its default is 1.

  • sf_of_scale_matrix_GAMMA: The original scaling matrix (S) of the (scaled) inverse Wishart distribution for the encoding and motor execution parameters is an identity matrix S=I. sf_of_scale_matrix_GAMMA is a scaling factor, that scales this matrix (S=sf_of_scale_matrix_GAMMA*I). Its default is 1.

  • prec_epsilon: This is epsilon in the paper. It is the precision of mu_alpha and all xi (scaling parameter in the scaled inverse Wishart distribution). Its default is also 1.

  • add_df_to_invWish: If P is the number of parameters or rather the size of the scale matrix used in the (scaled) inverse Wishart distribution then add_df_to_invWish is the number of degrees of freedom that can be added to it. So DF = P + add_df_to_invWish. The default for add_df_to_invWish is 1, such that the correlations are uniformly distributed within [-1, 1].

indices

Model selection indices. If set to TRUE the log-likelihood for each iteration and trial will be stored temporarily and with that the WAIC and LOO will be calculated via the loo package. If you want to have this log-likelihood matrix stored in the output of this function, you can set save_log_lik to TRUE. The default for indices is FALSE.

save_log_lik

If set to TRUE and indices = TRUE the log-likelihood matrix for each iteration and trial will be saved in the output as a matrix. Its default is FALSE.

old_label

If set to TRUE the old labels of "subj" and "group" of the data will be used in the elements of the output list. Default is FALSE.

Value

A list of the class rtmpt_fit containing

  • samples: the posterior samples as an mcmc.list object,

  • diags: some diagnostics like deviance information criterion, posterior predictive checks for the frequencies and latencies, potential scale reduction factors, and also the 99% and 95% HDIs and medians for the group-level parameters,

  • specs: some model specifications like the model, arguments of the model call, and information about the data transformation,

  • indices (optional): if enabled, WAIC and LOO,

  • LogLik (optional): if enabled, the log-likelihood matrix used for WAIC and LOO.

  • summary includes posterior mean and median of the main parameters.

Author(s)

Raphael Hartmann

References

Hartmann, R., Johannsen, L., & Klauer, K. C. (2020). rtmpt: An R package for fitting response-time extended multinomial processing tree models. Behavior Research Methods, 52(3), 1313–1338.

Hartmann, R., & Klauer, K. C. (2020). Extending RT-MPTs to enable equal process times. Journal of Mathematical Psychology, 96, 102340.

Klauer, K. C. (2010). Hierarchical multinomial processing tree models: A latent-trait approach. Psychometrika, 75(1), 70-98.

Klauer, K. C., & Kellen, D. (2018). RT-MPTs: Process models for response-time distributions based on multinomial processing trees with applications to recognition memory. Journal of Mathematical Psychology, 82, 111-130.

Spiegelhalter, D. J., Best, N. G., Carlin, B. P., & Van Der Linde, A. (2002). Bayesian measures of model complexity and fit. Journal of the royal statistical society: Series b (statistical methodology), 64(4), 583-639.

Vehtari, A., Gelman, A., & Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27(5), 1413-1432.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research, 11(Dec), 3571-3594.

Examples

####################################################################################
# Detect-Guess variant of the Two-High Threshold model.
# The encoding and motor execution times are assumed to be equal for each response.
####################################################################################

mdl_2HTM <- "
# targets
do+(1-do)*g
(1-do)*(1-g)

# lures
(1-dn)*g
dn+(1-dn)*(1-g)

# do: detect old; dn: detect new; g: guess
"

model <- to_rtmpt_model(mdl_file = mdl_2HTM)

data_file <- system.file("extdata/data.txt", package="rtmpt")
data <- read.table(file = data_file, header = TRUE)
data_list <- to_rtmpt_data(raw_data = data, model = model)

# This might take some time
rtmpt_out <- fit_rtmpt(model = model, data = data_list, Rhat_max = 1.1)
rtmpt_out

# Type ?SimData for another working example.

rtmpt documentation built on April 10, 2022, 5:05 p.m.