calcDDTable: Calculates DiBello-Dirichlet model probability and parameter...
In ralmond/CPTtools: Tools for Creating Conditional Probability Tables

calcDDTable

R Documentation

Calculates DiBello–Dirichlet model probability and parameter tables

Description

The DiBello–Dirichlet model creates a hyper-Dirichlet prior distribution by interpolating between a masterProfile and a noviceProfile. This function builds the hyper-Dirichlet parameter table or, with normalization, the conditional probability table for this distribution type.

Usage

calcDDTable(skillLevels, obsLevels, skillWeights, masterProfile,
            noviceProfile = 0.5, rule = "Compensatory")
calcDDFrame(skillLevels, obsLevels, skillWeights, masterProfile,
            noviceProfile = 0.5, rule = "Compensatory")

Arguments

`skillLevels`	A list of character vectors giving names of levels for each of the condition variables.
`obsLevels`	A character vector giving names of levels for the output variables from highest to lowest. As a special case, can also be a vector of integers.
`skillWeights`	A numeric vector of the same length as `skillLevels` giving the weight to be applied to each skill.
`masterProfile`	The Dirichlet prior for “experts” (see Details). Its length should match `obsLevels`.
`noviceProfile`	The Dirichlet prior for “novices” (see Details). Its length should match `obsLevels`; as a special case, a scalar gives a uniform prior with that value for every level. The default is a uniform prior with weight 1/2.
`rule`	Function for computing effective theta (see Details).

Details

Assume for the moment that there are two possible skill profiles: “expert” and “novice”. This model presumes a conditional probability table for the outcome given skill profile with two rows each of which is an independent categorical distribution. The natural conjugate prior is an independent Dirichlet distribution for each row. The parameters for this distribution are given in the masterProfile and noviceProfile arguments.
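As a minimal illustration of the two-profile case, the base R sketch below stacks two Dirichlet parameter vectors into a table and normalizes each row to get the prior mean of the conditional probability table. The outcome levels, the profile values, and the names `master`, `novice`, `params` and `cpt` are made up for illustration and do not come from CPTtools.

```r
# Hypothetical Dirichlet parameters for a three-level outcome
master <- c(Right = 8, Partial = 3, Wrong = 1)
novice <- c(Right = 1, Partial = 1, Wrong = 1)

# One independent Dirichlet distribution per row of the table
params <- rbind(expert = master, novice = novice)

# The mean of a Dirichlet is its parameter vector divided by its sum,
# so normalizing each row gives the expected conditional probability table
cpt <- params / rowSums(params)
print(round(cpt, 3))
```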

If there is more than one parent variable or if the parent variable has more than two states, the situation becomes muddier. The “expert” state is obviously the one with all the variables at the highest levels and the “novice” is the one with all variables at the lowest levels. If we can assign a number between 0 and 1 to each of the intermediate states, then we can interpolate between the two profiles to produce Dirichlet priors for each row.

This distribution type uses the DiBello effective theta technique to do the interpolation. The effectiveThetas function assigns a numeric ‘theta’ value to each state of each parent variable. For each row, the parent thetas are combined into a single value by the function named in the rule argument, and the combined values are then rescaled to the range 0–1. The prior for that row is a weighted combination of the masterProfile and noviceProfile, weighted by the rescaled theta.
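The interpolation can be sketched in base R as follows. This is an illustration under stated assumptions, not the package source: `eff_thetas` (equally spaced normal quantiles) and `compensatory` (a scaled weighted sum) are hypothetical stand-ins for the package's effectiveThetas and Compensatory functions, and the exact formulas used by CPTtools may differ.

```r
# ASSUMPTION: effective thetas are equally spaced normal quantiles,
# ordered from the highest level to the lowest
eff_thetas <- function(n) rev(qnorm((2 * seq_len(n) - 1) / (2 * n)))

# ASSUMPTION: a compensatory rule is a scaled weighted sum minus a difficulty
compensatory <- function(theta, alphas, beta) {
  as.vector(as.matrix(theta) %*% alphas) / sqrt(length(alphas)) - beta
}

skill_levels <- list(S1 = c("High", "Medium", "Low"), S2 = c("High", "Low"))
weights <- c(S1 = 2, S2 = 1)
master <- c(A = 7.5, B = 15.5, C = 3.5, D = 0.5, E = 0.5)
novice <- rep(0.5, length(master))  # scalar 0.5 expanded to a uniform prior

# One row of thetas per parent configuration, in expand.grid order
thetas <- expand.grid(lapply(lengths(skill_levels), eff_thetas))
etheta <- compensatory(thetas, weights, 0)

# Rescale to [0, 1]: 1 is the all-highest row, 0 is the all-lowest row
w <- (etheta - min(etheta)) / (max(etheta) - min(etheta))

# Each row is a weighted combination of the two profiles
param_table <- outer(w, master) + outer(1 - w, novice)
rownames(param_table) <- do.call(paste, c(expand.grid(skill_levels), sep = "."))
print(round(param_table, 2))
```

In this sketch the first row (all skills high) reproduces the master profile and the last row (all skills low) reproduces the novice profile; the rows in between fall between the two.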

The individual effective theta values are combined into a joint effective theta by the function referenced by rule. This should be a function of three arguments: theta, the vector of effective theta values for each parent; alphas, the vector of discrimination parameters; and beta, a scalar value giving the difficulty. The initial distribution supplies three functions appropriate for use with calcDDTable: Compensatory, Conjunctive, and Disjunctive. Note that the beta argument is effectively ignored because the output is later rescaled.
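To see why beta drops out, consider the sketch below. It defines a hypothetical rule, `weighted_mean_rule`, with the three-argument signature, and applies a min-max rescaling (`rescale01`, also a made-up name) to its output. It assumes beta enters the rule as a constant shift, which is the case for this example rule but is not confirmed here for every rule in the package; whether the package passes theta as a vector, matrix or data frame is likewise an assumption, so the rule coerces with `as.matrix`.

```r
# Hypothetical rule with the arguments (theta, alphas, beta)
weighted_mean_rule <- function(theta, alphas, beta) {
  as.vector(as.matrix(theta) %*% alphas) / sum(alphas) - beta
}

# Min-max rescaling of the combined thetas to the range 0-1
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Made-up effective thetas for four parent configurations of two skills
theta <- cbind(S1 = c(1, -1, 1, -1), S2 = c(0.5, 0.5, -0.5, -0.5))
alphas <- c(S1 = 2, S2 = 1)

w_easy <- rescale01(weighted_mean_rule(theta, alphas, beta = -1))
w_hard <- rescale01(weighted_mean_rule(theta, alphas, beta = 3))

# A constant shift cancels in the rescaling, so both results agree
print(all.equal(w_easy, w_hard))
```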

Normally obsLevels should be a character vector giving state names. However, when the state names are integer values, R will “helpfully” convert them to legal variable names by prepending a letter. This breaks other functions that rely on the names() of the result being the state names. As a special case, if the value of obsLevels is of type numeric, then calcDDFrame() will make sure that the correct values are preserved.
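The name conversion described here is standard base R behavior and can be reproduced without the package. The sketch below only demonstrates that behavior; how calcDDFrame works around it internally is not shown, and the probabilities are made up.

```r
int_levels <- c("2", "1", "0")

# Syntactic-name repair prepends "X" to names that start with a digit
print(make.names(int_levels))

# By default (check.names = TRUE) data.frame() repairs the column names
df_fixed <- data.frame(`2` = 0.5, `1` = 0.3, `0` = 0.2)
print(names(df_fixed))

# check.names = FALSE keeps the state names unchanged
df_kept <- data.frame(`2` = 0.5, `1` = 0.3, `0` = 0.2, check.names = FALSE)
print(names(df_kept))
```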

Value

For calcDDTable, a matrix whose rows correspond to configurations of the parent variable states (skillLevels) and whose columns correspond to obsLevels. Each row of the table holds the parameters of a Dirichlet distribution, so the whole matrix gives the parameters of a hyper-Dirichlet distribution. The order of the parent rows is the same as that produced by applying expand.grid to skillLevels.
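The row order can be previewed directly with expand.grid, in which the first variable varies fastest. The skill names and levels below are illustrative.

```r
skill_levels <- list(S1 = c("High", "Medium", "Low"), S2 = c("High", "Low"))

# One row per parent configuration; S1 cycles fastest, S2 slowest
configs <- expand.grid(skill_levels, stringsAsFactors = FALSE)
print(configs)
```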

For calcDDFrame, a CPF: a data frame with additional columns corresponding to the entries in skillLevels, giving the parent value for each row.

Note

Unlike calcDSTable, there is no corresponding support for the DiBello–Dirichlet distribution in StatShop. This function is used to model the parameters of an unconstrained hyper-Dirichlet distribution.

This was originally designed for use in Situational Judgment Tests where experts might not agree on the “key”.

Note: Zeros in the masterProfile indicate responses that a master would never make. They will result in zero probability of mastery for any response which yields that outcome.
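A small Bayes' rule calculation shows the effect. The response probabilities and the prior probability of mastery below are made-up values for a two-state (master or novice) skill, not output from the package.

```r
# Hypothetical response probabilities for options A-E
p_master <- c(A = 0.28, B = 0.60, C = 0.12, D = 0, E = 0)
p_novice <- rep(0.2, 5)
prior_master <- 0.5

# Bayes' rule: P(master | response) for each possible response
post_master <- (prior_master * p_master) /
  (prior_master * p_master + (1 - prior_master) * p_novice)
print(round(post_master, 3))
```

Responses D and E give a posterior probability of mastery of exactly zero, whatever the prior, which is why the Examples add a small constant to the expert counts.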

References

Almond, R.G. and Roberts, R. (Draft) Bayesian Scoring for Situational Judgment Tests. Unpublished white paper.

Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.

Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. Artificial Intelligence and Statistics 2001, Jaakkola and Richardson (eds.), Morgan Kaufmann, 137–143.

Examples


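  library(CPTtools)

  ## Level names for the two parent skills and the five response options
  skill1l <- c("High","Medium","Low")
  skill2l <- c("High","Low")
  option5L <- c("A","B","C","D","E")

  ## Expert responses
  eProfile <- c(A=7,B=15,C=3,D=0,E=0)

  ## Hyper-Dirichlet parameter table; adding 0.5 keeps the zero counts
  ## for D and E from becoming zero Dirichlet parameters
  paramT <- calcDDTable(list(S1=skill1l,S2=skill2l), option5L,
                        c(S1=2,S2=1), masterProfile=eProfile+0.5)

  ## Data frame version with a heavier expert prior and a uniform
  ## novice prior of 2 for each option
  paramF <- calcDDFrame(list(S1=skill1l,S2=skill2l), option5L,
                        c(S1=2,S2=1), masterProfile=5*eProfile+0.5,
                        noviceProfile=2)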

ralmond/CPTtools documentation built on Dec. 27, 2024, 7:15 a.m.