NPC: Estimation of examinees' attribute profiles using the NPC...
In NPCDTools: The Nonparametric Classification Methods for Cognitive Diagnosis

NPC	R Documentation

Estimation of examinees' attribute profiles using the NPC method

Description

The function is used to estimate examinees' attribute profiles using the nonparametric classification (NPC) method (Chiu, & Douglas, 2013). It uses a distance-based algorithm on the observed item responses for classifying examiness. This function estimates attribute profiles using nonparametric approaches for both the "AND gate" (conjunctive) and the "OR gate" (disjunctive) cognitive diagnostic models. These algorithms select the attribute profile with the smallest loss function value (plain, weighted, or penalized Hamming distance, see below for details) as the estimate. If more than one attribute profiles have the smallest loss function value, one of them is randomly chosen.

Usage

NPC(
  Y,
  Q,
  gate = c("AND", "OR"),
  method = c("Hamming", "Weighted", "Penalized"),
  wg = 1,
  ws = 1
)

Arguments

`Y`	A matrix of binary responses. Rows represent persons and columns represent items. 1=correct, 0=incorrect.
`Q`	The Q-matrix of the test. Rows represent items and columns represent attributes. 1=attribute required by the item, 0=attribute not required by the item.
`gate`	A character string specifying the type of gate. It can be one of the following: "AND" The examinee needs to possess all required attributes of an item in order to answer it correctly. "OR" The examinee needs to possess only one of the required attributes of an item in order to answer it correctly.
`method`	The method of nonparametric estimation. "Hamming" The plain Hamming distance method "Weighted" The Hamming distance weighted by inversed item variance "Penalized" The Hamming distance weighted by inversed item variance and specified penalizing weights for guess and slip.
`wg`	Additional argument for the "penalized" method. It is the weight assigned to guessing in the DINA or DINO models. A large value of weight results in a stronger impact on Hamming distance (larger loss function values) caused by guessing.
`ws`	Additional input for the "penalized" method. It is the weight assigned to slipping in the DINA or DINO models. A large value of weight results in la stronger impact on Hamming distance (larger loss function values) caused by slipping.

Value

The function returns a series of outputs, including:

alpha.est: Estimated attribute profiles. Rows represent persons and columns represent attributes. 1=examinee masters the attribute, 0=examinee does not master the attribute.
est.ideal: Estimated ideal response to all items by all examinees. Rows represent persons and columns represent items. 1=correct, 0=incorrect.
est.class: The class number (row index in pattern) for each person's attribute profile. It can also be used for locating the loss function value in loss.matrix for the estimated attribute profile for each person.
n.tie: Number of ties in the Hamming distance among the candidate attribute profiles for each person. When we encounter ties, one of the tied attribute profiles is randomly chosen.
pattern: All possible attribute profiles in the search space.
loss.matrix: The matrix of the values for the loss function (the plain, weighted, or penalized Hamming distance). Rows represent candidate attribute profiles in the same order with the pattern matrix; columns represent different examinees.

NPC algorithm with three distacne methods

Proficiency class membership is determined by comparing an examinee's observed item response vector \boldsymbol{Y} with each of the ideal item response vectors of the realizable 2^K=M proficiency classes. The ideal item responses are a function of the Q-matrix and the attribute vectors characteristic of the different proficiency classes. Hence, an examinee’s proficiency class is identified by the attribute vector \boldsymbol{\alpha}_{m} underlying that ideal item response vector which is closest—or most similar—to an examinee’s observed item response vector. The ideal response to item j is the score that would be obtained by an examinee if no perturbation occurred.

Let \boldsymbol{\eta}_{i} denote the J-dimensional ideal item response vector of examinee i, and the \hat{\boldsymbol{\alpha}} of an examinee’s attribute vector is defined as the attribute vector underlying the ideal item response vector that among all ideal item response vectors minimizes the distance to an examinee’s observed item response vector: \hat{\boldsymbol{\alpha}} = \arg\min_{m \in \{1,2,\ldots,M\}} d(\boldsymbol{y_i}, \boldsymbol{\eta}_{m})

A distance measure often used for clustering binary data is the Hamming distance that simply counts the number of disagreements between two vectors: d_H(\boldsymbol{y,\eta}) = \sum_{j=1}^{J} | y_j - \eta_j |

If the different levels of variability in the item responses are to be incorporated, then the Hamming distances can be weighted, for example, by the inverse of the item sample variance, which allows for larger impact on the distance functions of items with smaller variance: d_{wH} (\boldsymbol{y,\eta}) = \sum_{j=1}^{J} \frac{1}{\overline{p_j}(1-\overline{p_j})} |y_j-\eta_j|

Weighting weighting differently for departures from the ideal response model that would result from slips versus guesses is also considered: d_{gs}(\boldsymbol{y,\eta})=\sum_{j=1}^{J} w_gI[y_j=1]|y_j-\eta_j|+\sum_{j=1}^{J}w_sI[y_j=0]|y_j-\eta_j|

References

Chiu, C. (2011). Flexible approaches to cognitive diagnosis: nonparametric methods and small sample techniques. Invited session of cognitive diagnosis and item response theory at 2011 Joint Statistical Meeting.

Chiu, C. Y., & Douglas, J. A. (2013). A nonparametric approach to cognitive diagnosis by proximity to ideal response patterns. Journal of Classification 30(2), 225-250.

Examples

# Generate item and examinee profiles

natt <- 3
nitem <- 4
nperson <- 5
Q <- rbind(c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(1, 1, 1))
alpha <- rbind(c(0, 0, 0), c(1, 0, 0), c(0, 1, 0), c(0, 0, 1), c(1, 1, 1))

# Generate DINA model-based response data
slip <- c(0.1, 0.15, 0.2, 0.25)
guess <- c(0.1, 0.15, 0.2, 0.25)
my.par <- list(slip=slip, guess=guess)
data <- matrix(NA, nperson, nitem)
eta <- matrix(NA, nperson, nitem)
for (i in 1:nperson) {
  for (j in 1:nitem) {
    eta[i, j] <- prod(alpha[i,] ^ Q[j, ])
    P <- (1 - slip[j]) ^ eta[i, j] * guess[j] ^ (1 - eta[i, j])
    u <- runif(1)
    data[i, j] <- as.numeric(u < P)
    }
}

# Using the function to estimate examinee attribute profile
alpha.est.NP.H <- NPC(data, Q, gate="AND", method="Hamming")
alpha.est.NP.W <- NPC(data, Q, gate="AND", method="Weighted")
alpha.est.NP.P <- NPC(data, Q, gate="AND", method="Penalized", wg=2, ws=1)

nperson <- 1   # Choose an examinee to investigate
print(alpha.est.NP.H) # Print the estimated examinee attribute profiles

NPCDTools documentation built on Sept. 23, 2024, 5:10 p.m.