View source: R/classification.R
HB.CA | R Documentation |
An implementation of what has come to be known as the "Hanson and Brennan approach" to classification consistency and accuracy. By employing a compound beta-binomial distribution, the approach assumes that true-scores conform to the four-parameter beta distribution and that errors of measurement conform to a two-term approximation of the compound binomial distribution. Under these assumptions, the expected classification consistency and accuracy of a test can be estimated from observed outcomes and test reliability.
HB.CA(
x = NULL,
reliability,
cut,
testlength,
true.model = "4P",
truecut = NULL,
output = c("accuracy", "consistency"),
failsafe = TRUE,
l = 0,
u = 1,
modelfit = 10
)
x |
A vector of observed scores, or a list specifying parameter values. If a list is provided, the list entries must be named after the parameters: |
reliability |
The observed-score squared correlation (i.e., proportion of shared variance) with the true-score. |
cut |
The cutoff value for classifying observations into above/below categories. |
testlength |
The total number of test items (or maximum possible score). Must be an integer. |
true.model |
The probability distribution to be fitted to the moments of the true-score distribution. Options are |
truecut |
Optional specification of a "true" cutoff. Useful for producing ROC curves (see documentation for the |
output |
Character vector indicating which types of statistics (i.e., accuracy and/or consistency) are to be computed and included in the output. Permissible values are
failsafe |
Logical value indicating whether to engage the automatic fail-safe defaulting to the two-parameter Beta true-score distribution if the four-parameter fitting procedure produces impermissible parameter estimates. Default is |
l |
If |
u |
If |
modelfit |
Allows for controlling the chi-square test for model fit by setting the minimum bin-size for expected observations. Can alternatively be set to |
A list containing the estimated parameters necessary for the approach (i.e., the effective test-length and the beta distribution parameters), a chi-square test of model-fit, the confusion matrix containing estimated proportions of true/false pass/fail categorizations for a test, diagnostic performance statistics, and/or a classification consistency matrix and indices. Accuracy output includes a confusion matrix and diagnostic performance indices, and consistency output includes a consistency matrix and consistency indices p (expected proportion of agreement between two independent test administrations), p_c (proportion of agreement on two independent administrations expected by chance alone), and Kappa (Cohen's Kappa).
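The relationship between the three consistency indices can be sketched with base R. The helper name `consistency_indices` and the matrix values below are hypothetical and made up for illustration; they are not output of HB.CA().

```r
# Hypothetical helper (not part of the package): compute agreement indices
# from a square matrix of joint classification proportions.
consistency_indices <- function(m) {
  m <- m / sum(m)                        # normalise so the cells sum to 1
  p <- sum(diag(m))                      # observed proportion of agreement
  p_c <- sum(rowSums(m) * colSums(m))    # agreement expected by chance alone
  kap <- (p - p_c) / (1 - p_c)           # Cohen's Kappa
  c(p = p, p_c = p_c, Kappa = kap)
}

# Made-up consistency matrix: rows = first administration,
# columns = second administration.
cmat <- matrix(c(0.60, 0.10,
                 0.10, 0.20),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("pass", "fail"), c("pass", "fail")))
indices <- consistency_indices(cmat)
print(round(indices, 4))
```

Kappa rescales p so that 0 corresponds to chance-level agreement and 1 to perfect agreement, which is why it is reported alongside p and p_c.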
This implementation of the Hanson and Brennan approach is much slower than the implementation of the Livingston and Lewis approach, as there is no native implementation of Lord's two-term approximation to the compound binomial distribution in R. It instead uses a "brute-force" method of computing the cumulative probabilities from the compound binomial distribution, which is necessarily more resource-intensive.
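To give a sense of what a brute-force computation involves, the sketch below evaluates Lord's two-term approximation with base R's dbinom(). It assumes the form of the approximation given in Hanson (1991), with `k` denoting Lord's k. The function names `dcbinom2` and `pcbinom2` are hypothetical and do not reproduce the package's internal code.

```r
# Hypothetical helper: probability of observed score x given true score tau
# under Lord's two-term approximation to the compound binomial distribution.
# n = test length, k = Lord's k (k = 0 reduces to the ordinary binomial).
dcbinom2 <- function(x, n, tau, k) {
  # dbinom() returns 0 for scores outside 0..size, which handles the edges.
  correction <- dbinom(x, n - 2, tau) -
    2 * dbinom(x - 1, n - 2, tau) +
    dbinom(x - 2, n - 2, tau)
  dbinom(x, n, tau) - k * tau * (1 - tau) * correction
}

# "Brute-force" cumulative probability: sum the mass function over 0..q.
pcbinom2 <- function(q, n, tau, k) {
  sum(dcbinom2(0:q, n, tau, k))
}

probs <- dcbinom2(0:20, n = 20, tau = 0.6, k = 1)
total <- sum(probs)  # the correction terms cancel, so the total is 1
below_cut <- pcbinom2(9, n = 20, tau = 0.6, k = 1)  # P(score < 10 | tau)
```

Each cumulative probability requires summing the mass function term by term, and a full analysis must additionally integrate over the true-score distribution, which is where the extra computational cost comes from.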
Hanson, Bradley A. (1991). Method of Moments Estimates for the Four-Parameter Beta Compound Binomial Model and the Calculation of Classification Consistency Indexes. American College Testing.
Lord, Frederic M. (1965). A Strong True-Score Theory, With Applications. Psychometrika, 30(3).
Lewis, Don and Burke, C. J. (1949). The Use and Misuse of the Chi-Square Test. Psychological Bulletin, 46(6).
# Generate some fictional data. Say, 1000 individuals take a 20-item test.
set.seed(1234)
p.success <- rBeta.4P(1000, 0.15, 0.85, 6, 4)
# Allocate the response matrix once, then fill one item (column) at a time.
rawdata <- matrix(nrow = 1000, ncol = 20)
for (i in 1:20) {
  rawdata[, i] <- rbinom(1000, 1, p.success)
}
# Suppose the cutoff value for attaining a pass is 10 items correct, and
# that the reliability of this test was estimated using the Cronbach's Alpha
# estimator. To estimate and retrieve the estimated parameters, confusion and
# consistency matrices, and accuracy and consistency indices using HB.CA():
HB.CA(x = rowSums(rawdata), reliability = cba(rawdata), cut = 10,
testlength = 20)