areaProbs: Translates between normal and categorical probabilities

areaProbs    R Documentation

Translates between normal and categorical probabilities

Description

Maps between a continuous (normal) variable and a discrete variable by establishing a set of bins that maintain a particular probability vector. The pvecToCutpoints function returns the cut points separating the bins, the pvecToMidpoints function returns a central point from each bin, and the areaProbs function calculates the fraction of a normal curve falling into each bin.

Usage

pvecToCutpoints(pvec, mean = 0, std = 1)
pvecToMidpoints(pvec, mean = 0, std = 1)
areaProbs(pvec, condmean, condstd, mean = 0, std = 1)

Arguments

pvec

A vector of marginal probabilities for the categories of the discrete variable. Elements should be ordered from the lowest state to the highest.

mean

The mean of the continuous variable.

std

The standard deviation of the continuous variable.

condmean

The conditional mean of the continuous variable.

condstd

The conditional standard deviation of the continuous variable.

Details

Let S be a discrete variable whose states s_k are given by names(pvec)[k] and for which the marginal probability Pr(S=s_k) = p_k is given by pvec[k]. Let Y be a continuous normal variable with mean mean and standard deviation std. These functions map between S and Y.

The function pvecToCutpoints produces a series of cutpoints, c_k, such that setting S to s_k when c_k \le Y \le c_{k+1} produces the marginal probability specified by pvec. Note that c_1 is always -Inf and c_{K+1} is always Inf (where K is length(pvec)).
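
The cutpoint construction can be sketched in base R. This is only an illustration under the assumption that the interior cut points are normal quantiles of the cumulative probabilities; cutSketch is a hypothetical helper written for this example, not the package's own code.

```r
# Hypothetical helper (not part of CPTtools): interior cut points are taken
# as normal quantiles of the cumulative probabilities; the outer two cut
# points are set to -Inf and Inf explicitly.
cutSketch <- function(pvec, mean = 0, std = 1) {
  inner <- qnorm(head(cumsum(pvec), -1), mean = mean, sd = std)
  c(-Inf, inner, Inf)
}

probs <- c(Low = .05, Med = .9, High = .05)
cuts <- cutSketch(probs)

# The area of the standard normal curve between consecutive cut points
# should reproduce the marginal probabilities in probs.
binAreas <- diff(pnorm(cuts))
```

Setting the outer cut points directly, rather than computing qnorm(0) and qnorm(1), avoids any trouble when the probabilities sum to slightly more than 1 in floating point.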

The function pvecToMidpoints produces the midpoints (with respect to the normal density) of the intervals defined by pvecToCutpoints. In particular, if Pr(S < s_k) = P_k, then the values returned are qnorm(P_k + p_k / 2).
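
A base-R sketch of the midpoint idea follows. It assumes that the midpoint of a bin is the normal quantile taken halfway (in probability) through that bin, measured from the cumulative probability below it; midSketch is a hypothetical helper for illustration and may differ from the package's implementation.

```r
# Hypothetical helper (not part of CPTtools): the quantile at the
# probability midpoint of each bin.
midSketch <- function(pvec, mean = 0, std = 1) {
  below <- cumsum(pvec) - pvec   # probability mass lying below each bin
  qnorm(below + pvec / 2, mean = mean, sd = std)
}

probs <- c(Low = .05, Med = .9, High = .05)
# Quantiles at cumulative probabilities .025, .5 and .975
mids <- midSketch(probs)
```

Because the midpoint is defined on the probability scale, it is the median of Y within the bin rather than the arithmetic centre of the two cut points, which would be infinite for the outer bins.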

The function areaProbs inverts these calculations. If condmean is E[Y|x] and condstd is \sqrt{var(Y|x)}, then this function calculates Pr(S|x) by calculating the area under the normal curve.
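
The area calculation can be illustrated with a short base-R sketch. It assumes the bins are fixed by the marginal distribution of Y and that the conditional probabilities are differences of the conditional normal CDF at the cut points; areaSketch is a hypothetical helper for this example, not the package's code.

```r
# Hypothetical helper (not part of CPTtools): area of the conditional
# normal curve falling within each bin.
areaSketch <- function(pvec, condmean, condstd, mean = 0, std = 1) {
  # Bins are defined by the marginal distribution of Y.
  inner <- qnorm(head(cumsum(pvec), -1), mean = mean, sd = std)
  cuts <- c(-Inf, inner, Inf)
  # Differences of the conditional CDF at the cut points give Pr(S | x).
  out <- diff(pnorm(cuts, mean = condmean, sd = condstd))
  names(out) <- names(pvec)
  out
}

probs <- c(Low = .05, Med = .9, High = .05)
same <- areaSketch(probs, 0, 1)      # conditional matches marginal
shifted <- areaSketch(probs, 1, .5)  # conditional mean moved upward
```

When the conditional distribution equals the marginal one, the sketch returns pvec itself; raising the conditional mean moves probability out of the lowest category and into the highest.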

Value

For pvecToCutpoints, a vector of length one greater than pvec giving the endpoints of the bins. Note that the first and last values are always infinite.

For pvecToMidpoints, a vector of the same length as pvec giving the midpoints of the bins.

For areaProbs, a vector of probabilities of the same length as pvec.

Warning

Variables are given from lowest to highest state, for example ‘Low’, ‘Medium’, ‘High’. StatShop expects variables in the opposite order.

Note

The function effectiveThetas does something similar, but assumes all probability values are equally weighted.

Author(s)

Russell Almond

References

Almond, R.G., Mislevy, R.J., Steinberg, L.S., Yan, D. and Williamson, D.M. (2015) Bayesian Networks in Educational Assessment. Springer. Chapter 8.

Almond, R.G. (2010) ‘I Can Name that Bayesian Network in Two Matrixes.’ International Journal of Approximate Reasoning, 51, 167–178.

See Also

effectiveThetas

Examples

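library(CPTtools)

probs <- c(Low=.05, Med=.9, High=.05)
cuts <- pvecToCutpoints(probs)  # bin endpoints on the standard normal scale
mids <- pvecToMidpoints(probs)  # one central point per bin

# Category probabilities when Y given x has mean 1 and standard deviation 0.5
areaProbs(probs, 1, .5)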



ralmond/CPTtools documentation built on Dec. 27, 2024, 7:15 a.m.