calcDNTable: Creates the probability table for DiBello-Normal distribution

calcDNTable

R Documentation

Creates the probability table for DiBello–Normal distribution

Description

The calcDNTable function takes a description of the input and output variables for a Bayesian network distribution and a collection of IRT-like parameters (discrimination, difficulty) and calculates a conditional probability table using the DiBello–Normal distribution (see Details). The calcDNFrame function returns the same values as a data frame with labels for the parent states.

Usage

calcDNTable(skillLevels, obsLevels, lnAlphas, beta, std, rule = "Compensatory")
calcDNFrame(skillLevels, obsLevels, lnAlphas, beta, std, rule = "Compensatory")

Arguments

`skillLevels`	A list of character vectors giving the names of the levels for each of the conditioning (parent) variables.
`obsLevels`	A character vector giving the names of the levels for the output variable from highest to lowest. Can also be a vector of integers (see Details).
`lnAlphas`	A vector of log slope parameters. Its length should be either 1 or the length of `skillLevels`, depending on the choice of `rule`.
`beta`	A vector of difficulty (-intercept) parameters. Its length should be either 1 or the length of `skillLevels`, depending on the choice of `rule`.
`std`	The log of the residual standard deviation (see Details).
`rule`	Function for computing effective theta (see Details).

Details

The DiBello–Normal distribution (Almond et al., 2015) is a variant of the DiBello–Samejima distribution (Almond et al., 2001) for creating conditional probability tables for Bayesian networks. It uses a regression-like (probit) link function in place of Samejima's graded response link function. The basic procedure unfolds in three steps.

1. Each level of each input variable is assigned an “effective theta” value, a normal value to be used in calculations.
2. For each possible skill profile (combination of states of the parent variables), the effective thetas are combined using a combination function. This produces an “effective theta” for that skill profile.
3. Taking the effective theta value as the mean, the probability that the examinee will fall into each category is calculated.

The parent (conditioning) variables are described by the skillLevels argument, which should provide, for each parent variable in order, the names of the states ranked from highest to lowest value. The effective theta values for these states are calculated by the function effectiveThetas, which gives equally spaced points on the probability curve. Note that for the DiBello–Normal distribution, Step 1 and Step 3 are inverses of each other (except for rounding error).
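
As a rough illustration of Step 1, the following base R sketch computes equally spaced points on the normal probability curve. The helper name sketchEffectiveThetas is hypothetical, and the midpoint-quantile formula is an assumption about what "equally spaced" means here; consult effectiveThetas itself for the values the package actually uses.

```r
# Hypothetical helper (not part of CPTtools): assigns each of nlevels states
# the normal quantile at the midpoint of its equal-probability interval,
# ordered from the highest state to the lowest.
sketchEffectiveThetas <- function(nlevels) {
  midpoints <- (2 * (nlevels:1) - 1) / (2 * nlevels)
  qnorm(midpoints)
}

# Effective thetas for a three-level parent such as High/Medium/Low
round(sketchEffectiveThetas(3), 2)
```

Under this assumption the values are symmetric around zero, so the middle state of an odd number of levels maps to 0.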

The combination of the individual effective theta values into a joint value for effective theta is done by the function referenced by rule. This should be a function of three arguments: theta (the vector of effective theta values for each parent), alphas (the vector of discrimination parameters), and beta (a scalar value giving the difficulty). The initial distribution supplies five functions appropriate for use with calcDSTable: Compensatory, Conjunctive, Disjunctive, OffsetConjunctive, and OffsetDisjunctive. The last two have a slightly different parameterization: alpha is assumed to be a scalar and the betas parameter is vector valued. Note that the discrimination and difficulty parameters are built into the structure function and not the probit curve.
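
To make the three-argument signature concrete, here is a minimal sketch of a compensatory-style combination function for a single skill profile. The name sketchCompensatory is hypothetical, and both the weighted-sum form and the scaling by the square root of the number of parents are assumptions made for illustration; this is not the package's own Compensatory function.

```r
# Hypothetical combination rule with the three-argument signature
# described above; not the package's own Compensatory function.
#   theta  - effective thetas, one per parent, for a single skill profile
#   alphas - discrimination parameters, one per parent
#   beta   - scalar difficulty
sketchCompensatory <- function(theta, alphas, beta) {
  # Weighted sum of the parents' effective thetas, rescaled by the
  # square root of the number of parents (an assumed scaling), minus
  # the difficulty.
  sum(alphas * theta) / sqrt(length(theta)) - beta
}

# One parent in its highest of three states, the other in its lowest
sketchCompensatory(theta = c(0.97, -0.97), alphas = c(1, 0.75), beta = 1)
```

In a compensatory rule of this kind, a high value on one parent can offset a low value on another, with the discriminations controlling how much each parent contributes.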

The effective theta values are converted to probabilities by assuming that the categories for the consequence variable (obsLevels) are generated by taking equal probability divisions of a standard normal random variable. However, a person with a given pattern of condition variables is drawn from a population with mean at effective theta and standard deviation of exp(std). The returned numbers are the probabilities of being in each category.
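
The conversion described above can be sketched in base R as follows. The function name sketchCategoryProbs is hypothetical, and the code is an illustration of the description rather than the package's implementation: cut points divide a standard normal into equal-probability categories, and the category probabilities come from a normal distribution with mean at the effective theta and standard deviation exp(std).

```r
# Hypothetical helper (not part of CPTtools): probabilities of each of
# nObs output categories, ordered from highest to lowest, for one
# effective theta value. Assumes nObs >= 2.
sketchCategoryProbs <- function(effTheta, nObs, std) {
  # Cut points that split a standard normal into nObs equal-probability bins
  cuts <- qnorm((1:(nObs - 1)) / nObs)
  # Probability of falling below each cut point for this skill profile
  below <- pnorm(cuts, mean = effTheta, sd = exp(std))
  # Differences give the category probabilities from lowest to highest
  lowToHigh <- diff(c(0, below, 1))
  # Reverse so the order matches obsLevels (highest first)
  rev(lowToHigh)
}

# Four output categories, residual standard deviation 0.5
sketchCategoryProbs(effTheta = 0, nObs = 4, std = log(0.5))
```

With an effective theta of 0 and std = 0 (a standard deviation of 1), this sketch returns equal probabilities for all categories; smaller values of std concentrate the probability in the categories nearest the effective theta.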

Normally obsLevels should be a character vector giving state names. However, in the special case of state names which are integer values, R will “helpfully” convert these to legal variable names by prepending a letter. This causes other functions which rely on the names() of the result being the state names to break. As a special case, if the value of obsLevels is of type numeric, then calcDNFrame() will make sure that the correct values are preserved.
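
The renaming behavior can be seen in base R alone. The small matrix below is made-up illustration data, and the snippet shows only R's default name checking in data.frame(); it does not show how calcDNFrame() handles the numeric case internally.

```r
# Made-up probabilities with integer-valued state names as column names
probs <- matrix(1/3, nrow = 2, ncol = 3,
                dimnames = list(NULL, c("2", "1", "0")))

# Default name checking turns the names into legal variable names
names(data.frame(probs))

# Turning the check off keeps the original state names
names(data.frame(probs, check.names = FALSE))
```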

Value

For calcDNTable, a matrix whose rows correspond to configurations of the parent variable states (skillLevels) and whose columns correspond to obsLevels. Each row of the table is a probability distribution, so the whole matrix is a conditional probability table. The order of the parent rows is the same as is produced by applying expand.grid to skillLevels.
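
The row order can be previewed with base R's expand.grid, which varies the first variable fastest. The skill levels below are the ones used in the Examples section.

```r
# Parent state names, highest to lowest, as in the Examples section
skillLevels <- list(S1 = c("High", "Medium", "Low"),
                    S2 = c("High", "Medium", "Low", "LowerYet"))

# One row per parent configuration; S1 cycles fastest, then S2
profiles <- expand.grid(skillLevels)
nrow(profiles)
head(profiles, 4)
```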

For calcDNFrame, a CPF: a data frame with additional columns, corresponding to the entries in skillLevels, giving the parent value for each row.

Note

This distribution class was developed primarily for modeling relationships among proficiency variables. For models for observables, see calcDSTable.

This function has largely been superseded by calls to calcDPCTable with normalLink as the link function.

Author(s)

Russell Almond

References

Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.

Almond, R.G., DiBello, L., Jenkins, F., Mislevy, R.J., Senturk, D., Steinberg, L.S. and Yan, D. (2001) Models for Conditional Probability Tables in Educational Assessment. In Jaakkola and Richardson (eds.), Artificial Intelligence and Statistics 2001. Morgan Kaufmann, 137–143.

Examples

## Set up variables: state names ordered from highest to lowest
skill1l <- c("High","Medium","Low")
skill2l <- c("High","Medium","Low","LowerYet")
skill3l <- c("Advanced","Proficient","Basic","Developing")

## Conditional probability table as a matrix (one row per parent configuration)
cptSkill3 <- calcDNTable(list(S1=skill1l,S2=skill2l),skill3l,
                          log(c(S1=1,S2=.75)),1.0,log(0.5),
                          rule="Compensatory")

## Same table as a data frame with columns labeling the parent states
cpfSkill3 <- calcDNFrame(list(S1=skill1l,S2=skill2l),skill3l,
                          log(c(S1=1,S2=.75)),1.0,log(0.5),
                          rule="Compensatory")

ralmond/CPTtools documentation built on Dec. 27, 2024, 7:15 a.m.