normalLink: Link function using a normal regression.
In ralmond/CPTtools: Tools for Creating Conditional Probability Tables

Description

This link function assumes that the effective theta for this distribution defines the mean of a normal distribution in a generalized regression model. The link scale parameter gives the residual standard deviation.

Usage

normalLink(et, linkScale = NULL, obsLevels = NULL)

Arguments

`et`	A matrix of effective theta values. There should be one row in this table for each configuration of the parent variables of the conditional probability table and one column for each state of the child variable except for the last.
`linkScale`	The residual standard deviation parameter. This value must be supplied and should be a positive number; the default value of `NULL` generates an error.
`obsLevels`	An optional character vector giving the names of the child variable states. If supplied, it should have length `ncol(et)+1`.

Details

This function takes care of the third step in the algorithm of calcDPCTable. Its input is a matrix of effective theta values (comparable to the last column of the output of eThetaFrame), one column for each of the child variable states (obsLevels) except for the last one. Each row represents a different configuration of the parent variables. The output is the conditional probability table. The use of this function makes calcDPCTable behave like calcDNTable.

The idea behind this link function was first proposed in Almond (2010), and it is more completely described in Almond et al. (2015). The motivation comes from assuming that the child variable is created by taking cuts on an underlying continuous variable. The marginal distribution of this variable is a standard normal. The conditional distribution is like a regression prediction with the effective theta from the parent variables (the et argument) as the expected value and the linkScale parameter as the residual standard deviation.

The calculation works as follows. First, cut points are set according to the categories of the child variable. Let m be the number of categories (this should be one more than the number of columns of et, and equal to the length of obsLevels if that is supplied). Then the cut points are set at cuts <- qnorm(((m - 1):1)/m), which lists them in decreasing order.
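As a quick check of the cut-point rule, here is a minimal base R sketch (it does not call CPTtools) for a child variable with three states; the value m <- 3 is chosen only for illustration. Because the cut points are equally spaced quantiles, a standard normal variable should fall into each of the m resulting intervals with probability 1/m.

```r
# Number of child-variable states (illustrative choice, not from the package).
m <- 3

# Cut points in decreasing order, following the rule described above.
cuts <- qnorm(((m - 1):1) / m)

# Probability mass of a standard normal variable in each interval,
# going from the highest interval down to the lowest.
upper <- c(1, pnorm(cuts))
lower <- c(pnorm(cuts), 0)
marginal <- upper - lower

print(round(cuts, 3))
print(marginal)
```

The vector marginal holds the marginal probabilities of the m states before any parent information is taken into account.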

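Then, for each row i of the conditional probability table, the probability of being in state k is calculated as pnorm((cuts[k-1] - et[i, 1])/linkScale) - pnorm((cuts[k] - et[i, 1])/linkScale). At the endpoints the pnorm expression is set to 1 (the first term when k = 1) or 0 (the second term when k = m). Because the cut points are in decreasing order, state 1 is the highest category and each difference is nonnegative. Note that only the first column of et is used in the calculation.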
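The row-by-row calculation can be sketched in base R as follows. This is an illustration written from the description above, not the package's source code; the function name normalLinkSketch and the example effective theta values are hypothetical. Because the cut points are in decreasing order, the sketch subtracts the cumulative probability at each cut point from the one at the next higher cut point, so that state 1 is the highest category and every entry is nonnegative.

```r
# Hypothetical helper illustrating the normal link calculation (base R only).
normalLinkSketch <- function(et, linkScale, obsLevels = NULL) {
  if (is.null(linkScale) || linkScale <= 0)
    stop("linkScale must be a positive number")
  et <- as.matrix(et)
  m <- ncol(et) + 1                    # number of child-variable states
  cuts <- qnorm(((m - 1):1) / m)       # decreasing cut points

  # cum[i, k] = P(latent variable < cuts[k]) for parent configuration i.
  # Only the first column of et enters the calculation.
  cum <- pnorm(outer(-et[, 1], cuts, "+") / linkScale)
  cum <- matrix(cum, nrow = nrow(et))  # make sure the result is a matrix

  # Endpoints: cumulative probability 1 above the top cut, 0 below the bottom.
  probs <- cbind(1, cum) - cbind(cum, 0)
  if (!is.null(obsLevels)) colnames(probs) <- obsLevels
  probs
}

# Illustrative effective thetas: three parent configurations, two columns.
et <- matrix(c(1, 0, -1), nrow = 3, ncol = 2)
probs <- normalLinkSketch(et, 0.5, c("Full", "Partial", "None"))
print(round(probs, 3))
```

Each row of probs sums to 1, and a larger effective theta shifts probability toward the first (highest) state. A smaller linkScale concentrates each row on fewer states.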

Value

A matrix with one more column than et giving the conditional probabilities for each configuration of the parent variables (which correspond to the rows).

Note

The motivation for the normal link function originally came from the observation of odd behavior when variables given a DiBello-Samejima distribution (that is, using the gradedResponse link function) were used as parent variables for other variables with a DiBello-Samejima distribution.

One potential reason for the odd behavior was that the graded response link function was not an inverse of the procedure used to assign the effectiveThetas to the parent variables. Thus, using a probit link function (normalLink) was thought to be better for parent variables than using a logistic link function (gradedResponse). At the same time, the convention of assigning parent values based on quantiles of the normal distribution was adopted. This made normalLink and effectiveThetas approximate inverses (information is still lost through discretization). Note that in the current implementation, a scale factor of 1.7 has been added to both the partialCredit and gradedResponse functions to make the logistic function closer to the normal distribution and a better inverse for the effective theta procedure.
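The effect of the 1.7 scale factor can be checked numerically. The sketch below uses only base R and compares the standard normal distribution function with a logistic distribution function, with and without the factor. The grid of points is an arbitrary choice, and the code does not reproduce how partialCredit or gradedResponse apply the factor internally.

```r
# Grid of latent-scale values (arbitrary choice for illustration).
x <- seq(-4, 4, by = 0.01)

# Largest absolute gap between the logistic and normal curves,
# with and without the 1.7 scale factor.
diff_scaled <- max(abs(plogis(1.7 * x) - pnorm(x)))
diff_unscaled <- max(abs(plogis(x) - pnorm(x)))

print(c(scaled = diff_scaled, unscaled = diff_unscaled))
```

On this grid the scaled logistic curve stays within about 0.01 of the normal curve, while the unscaled one differs by roughly 0.1.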

Author(s)

Russell Almond

References

Almond, R. G. (2010). ‘I can name that Bayesian network in two matrixes.’ International Journal of Approximate Reasoning, 51, 167-178.

Almond, R. G., Mislevy, R. J., Steinberg, L. S., Yan, D. and Williamson, D. M. (2015). Bayesian Networks in Educational Assessment. Springer. Chapter 8.

Examples


library(CPTtools)

## State labels for the example variables.
skill1l <- c("High","Medium","Low")
correctL <- c("Correct","Incorrect")
pcreditL <- c("Full","Partial","None")
gradeL <- c("A","B","C","D","E")

## Get some effective theta values.
et <- effectiveThetas(3)

## Two-state child: one column of effective thetas.
normalLink(matrix(et,ncol=1),.5,correctL)
normalLink(matrix(et,ncol=1),.3,correctL)
## Three-state child: two columns (only the first is used).
normalLink(matrix(et,nrow=3,ncol=2),.5,pcreditL)
normalLink(matrix(et,nrow=3,ncol=2),.8,pcreditL)

## Five-state child: four columns.
normalLink(matrix(et,nrow=3,ncol=4),.5,gradeL)
normalLink(matrix(et,nrow=3,ncol=4),.25,gradeL)

ralmond/CPTtools documentation built on Dec. 27, 2024, 7:15 a.m.