data_loglik: Marginal loglikelihood of the observed data

View source: R/data_loglik.R

data_loglik R Documentation

Marginal loglikelihood of the observed data

Description

Compute the marginal loglikelihood of the observed data.

  • When there are only standalone items, this computes the regular loglikelihood of the observed data.

  • When there are cluster items, this computes the marginal loglikelihood, where "marginal" means that the nuisance dimension is integrated out of the conditional likelihood of the cluster items.

    Use '?MIRTutils-package' for more details, such as the context of the current package and models supported.
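One way to write the quantity described above is sketched below. It assumes local independence given the abilities and a normally distributed nuisance dimension u_c with mean 0 and variance sigma_c^2 for each cluster c. The notation is illustrative and is not taken from the package; see '?MIRTutils-package' for the models actually supported.

```latex
\log L(\theta)
  = \sum_{i \in \mathrm{SA}} \log P(x_i \mid \theta)
  + \sum_{c} \log \int \Bigl[ \prod_{j \in c} P(x_j \mid \theta, u_c) \Bigr]
      \, \phi(u_c;\, 0,\, \sigma_c^2) \, du_c
```

The first sum covers the standalone items and the second covers the clusters. With no cluster items the second sum vanishes and the expression reduces to the regular loglikelihood.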

Usage

data_loglik(
  theta,
  SA_dat = NULL,
  Cluster_dat = NULL,
  SA_parm = NULL,
  Cluster_parm = NULL,
  Dv = 1,
  n.nodes = 21,
  return_additional = TRUE,
  missing_as_incorrect = F
)

Arguments

theta

a scalar or a vector of examinee abilities.

SA_dat

For one examinee, a vector of responses to standalone items. For more than one examinee, a matrix or dataframe of responses to standalone items. One assertion per column. Column order must match row order in SA_parm. Use NA for missing responses.

Cluster_dat

For one examinee, a vector of responses to cluster items. For more than one examinee, a matrix or dataframe of responses to cluster items. One assertion per column. Column order must match row order in Cluster_parm. Use NA for missing responses.

SA_parm

a matrix or dataframe of item parameters for standalone items, where columns are a (slope), b1, b2, ..., b_k (difficulty or step difficulty), g (guessing), ItemID, and AssertionID. Columns must follow the above order. See example_SA_parm for an example. Use ?example_SA_parm for detailed column descriptions.

Cluster_parm

a matrix or dataframe of item parameters for cluster items, where columns are a (slope), b (difficulty), cluster variance, cluster position, ItemID, and AssertionID. Columns must follow the above order. See example_Cluster_parm for an example. Use ?example_Cluster_parm for detailed column descriptions.

Dv

scaling factor for IRT model (usually 1 or 1.7)

n.nodes

number of nodes used when integrating out the nuisance dimension

return_additional

if TRUE, returns the data loglikelihood together with additional by-products of the function in a list. See the Value section for details.

missing_as_incorrect

if FALSE (the default), missing responses (NAs) are treated as missing; if TRUE, missing responses are treated as incorrect.
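To make the roles of Dv, n.nodes, and missing_as_incorrect concrete, here is a minimal sketch of the marginalization for a single cluster and a single examinee. It is not the package's implementation. The helper names (p_2pl, cluster_marginal_loglik), the 2PL form with the nuisance value added to theta, and the equally spaced nodes over plus or minus 4 standard deviations with normalized normal-density weights are all assumptions made for illustration.

```r
# Hypothetical helper: 2PL probability of a correct response.
# D is the scaling constant (the role played by Dv in data_loglik).
p_2pl <- function(theta, a, b, D = 1) {
  1 / (1 + exp(-D * a * (theta - b)))
}

# Hypothetical helper: marginal loglikelihood of one cluster for one examinee.
# x: 0/1 responses (NA = missing); a, b: item slopes and difficulties;
# cluster_var: variance of the nuisance dimension.
cluster_marginal_loglik <- function(theta, x, a, b, cluster_var,
                                    D = 1, n_nodes = 21,
                                    missing_as_incorrect = FALSE) {
  # Optionally score missing responses as incorrect
  if (missing_as_incorrect) x[is.na(x)] <- 0
  keep <- !is.na(x)  # any remaining NAs are skipped

  # Assumed quadrature rule: equally spaced nodes over +/- 4 SD,
  # weighted by the normal density and normalized to sum to 1
  sd_u <- sqrt(cluster_var)
  nodes <- seq(-4 * sd_u, 4 * sd_u, length.out = n_nodes)
  w <- dnorm(nodes, mean = 0, sd = sd_u)
  w <- w / sum(w)

  # Conditional likelihood of the observed responses at each node
  cond_lik <- vapply(nodes, function(u) {
    p <- p_2pl(theta + u, a[keep], b[keep], D)
    prod(p^x[keep] * (1 - p)^(1 - x[keep]))
  }, numeric(1))

  # Integrate out the nuisance dimension, then take the log
  log(sum(w * cond_lik))
}

# Toy data: four items in one cluster, third response missing
x <- c(1, 0, NA, 1)
a <- c(1, 1, 1, 1)
b <- c(-0.5, 0, 0.5, 1)
ll_skip <- cluster_marginal_loglik(0, x, a, b, cluster_var = 0.5)
ll_zero <- cluster_marginal_loglik(0, x, a, b, cluster_var = 0.5,
                                   missing_as_incorrect = TRUE)
```

In this sketch, scoring the missing response as incorrect multiplies the conditional likelihood at every node by an extra factor below 1, so ll_zero is smaller than ll_skip. Increasing n_nodes refines the approximation to the integral at the cost of more evaluations.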

Value

If return_additional is FALSE, returns a dataframe with two columns: theta and marginalized data loglikelihood.

If return_additional is TRUE, returns a list containing the loglikelihood dataframe and the following additional tables:

  • probs.SA: probability of correct response for standalone items

  • probs.cluster: (conditional) probability table of correct response for clusters at each node

  • parms: parameter tables in a list

Note

If the test has no standalone (SA) items or no cluster items, leave the corresponding data and parameter arguments at their default (NULL).

Author(s)

Zhongtian Lin lzt713@gmail.com

Examples

data(example_SA_parm)
data(example_Cluster_parm)
sigma <- diag(c(1, sqrt(unique(example_Cluster_parm$cluster_var))))
mu <- rep(0, nrow(sigma))
thetas <- MASS::mvrnorm(7,mu,sigma)
thetas[,1] <- seq(-3,3,1) #overall dimension theta values
itmDat <- sim_data(thetas = thetas, SA_parm = example_SA_parm, Cluster_parm = example_Cluster_parm)
SA_dat <- itmDat[,1:20]
Cluster_dat <- itmDat[,-1:-20]
rst <- data_loglik(thetas[,1], SA_dat, Cluster_dat, example_SA_parm, example_Cluster_parm,
                   n.nodes = 11, return_additional = TRUE)
rst$loglik
rst$probs.SA
length(rst$probs.cluster) # a list of conditional probabilities. The length of the list = number of clusters

woshikaqia/MIRTutils documentation built on Aug. 21, 2024, 4:30 p.m.