TSQE (R Documentation)
The function estimates the Q-matrix from the observed item responses using the two-step Q-matrix estimation (TSQE) method.
TSQE(
  Y,
  K,
  input.cor = c("tetrachoric", "Pearson"),
  ref.method = c("QR", "GDI"),
  GDI.model = c("DINA", "ACDM", "RRUM", "GDINA"),
  cutoff = 0.8
)
Y |
A binary matrix of observed item responses, with one row per examinee and one column per item. |
K |
The number of attributes in the Q-matrix. |
input.cor |
The type of correlation used to compute the input for the exploratory factor analysis. It could be the tetrachoric or Pearson correlation. |
ref.method |
The refinement method used to polish the provisional Q-matrix obtained from the EFA. Currently available methods include the Q-matrix refinement (QR) method and the G-DINA discrimination index (GDI). |
GDI.model |
The CDM used in the GDI algorithm to fit the data. Currently available models include the DINA model, the ACDM, the RRUM, and the G-DINA model. |
cutoff |
The cutoff used to dichotomize the entries in the provisional Q-matrix. |
The function returns the estimated Q-matrix.
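A minimal usage sketch. The response matrix below is random and purely illustrative; in practice Y would be real binary response data, and the package providing TSQE() must be loaded first.

```r
# Hypothetical data: 500 examinees, 20 items, binary responses.
set.seed(1)
Y <- matrix(rbinom(500 * 20, 1, 0.5), nrow = 500, ncol = 20)

# Estimate a Q-matrix with K = 3 attributes, refining the EFA-based
# provisional Q-matrix with the GDI under the G-DINA model.
Q.hat <- TSQE(Y, K = 3,
              input.cor = "tetrachoric",
              ref.method = "GDI",
              GDI.model  = "GDINA",
              cutoff     = 0.8)
```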
The TSQE method merges the Provisional Attribute Extraction (PAE) algorithm with a Q-matrix refinement-and-validation method, namely the Q-Matrix Refinement (QR) method or the G-DINA Model Discrimination Index (GDI). Specifically, the PAE algorithm relies on classic exploratory factor analysis (EFA) combined with a unique stopping rule to identify a provisional Q-matrix, which is then "polished" with a refinement method to derive the final estimate of the Q-matrix.
The initial step of the algorithm is to aggregate the collected responses into an inter-item tetrachoric correlation matrix. Tetrachoric correlation is used because the examinee responses are binary, which makes it more appropriate than the Pearson product-moment correlation coefficient; see Chiu et al. (2022) for details. The next step is to apply factor analysis to the item-correlation matrix and treat the extracted factors as proxies for the latent attributes. The third step identifies which specific attributes are required by which item:
1. Initialize the item index as j = 1.

2. Let l_{jk} denote the loading of item j on factor k, where k = 1, 2, \ldots, K.

3. Arrange the loadings in descending order and define a mapping function f(k) = t, where t is the order index. Hence, l_{j(1)} indicates the maximum loading, while l_{j(K)} indicates the minimum loading.

4. Define

p_j(t) = \frac{\sum_{h=1}^t l_{j(h)}^2}{\sum_{k=1}^K l_{jk}^2}

as the proportion of the communality of item j accounted for by the first t factors.

5. Define K_j = \min \{ t \mid p_j(t) \geq \lambda \}, where \lambda is the cutoff value for the desired proportion of item variance accounted for. Then the ordered entries of the provisional q-vector of item j are obtained as

q_{j(t)}^* = \begin{cases} 1 & \text{if } t \leq K_j \\ 0 & \text{if } t > K_j \end{cases}

6. Identify q_j^* = (q_{j1}^*, q_{j2}^*, \ldots, q_{jK}^*) by rearranging the ordered entries of the q-vector using the inverse function k = f^{-1}(t).

7. Set j = j + 1 and repeat (2) to (6) until j = J. Then denote the provisional Q-matrix as \mathbf{Q}^*.
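The steps above can be sketched in base R. This is an illustration, not the package's own implementation: it orders loadings by their squared values (consistent with the communality formula for p_j(t)) and runs the EFA with stats::factanal() on the raw responses; a tetrachoric correlation matrix (e.g., from psych::tetrachoric) could be supplied instead.

```r
# Illustrative sketch of the PAE steps (assumed helper, not the package code).
# Y: N x J binary response matrix; K: number of attributes;
# lambda: cutoff for the proportion of communality (cf. the 'cutoff' argument).
pae_sketch <- function(Y, K, lambda = 0.8) {
  efa <- factanal(Y, factors = K, rotation = "varimax")
  L <- unclass(efa$loadings)                    # J x K loading matrix
  Q <- matrix(0, nrow(L), K)
  for (j in seq_len(nrow(L))) {
    ord <- order(L[j, ]^2, decreasing = TRUE)   # f: factor k -> order index t
    p   <- cumsum(L[j, ord]^2) / sum(L[j, ]^2)  # p_j(t)
    Kj  <- which(p >= lambda)[1]                # K_j = min{t : p_j(t) >= lambda}
    Q[j, ord[seq_len(Kj)]] <- 1                 # q*_j, mapped back via f^{-1}
  }
  Q                                             # provisional Q-matrix Q*
}
```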
This function implements the Q-matrix refinement method developed by Chiu (2013), which is also based on the aforementioned nonparametric classification methods (Chiu & Douglas, 2013). This Q-matrix refinement method corrects potential misspecified entries of the Q-matrix through comparisons of the residual sum of squares computed from the observed and the ideal item responses.
The algorithm operates by minimizing the RSS. Recall that Y_{ij} is the observed response and \eta_{ij} is the ideal response. Then the RSS of item j for examinee i is defined as

RSS_{ij} = (Y_{ij} - \eta_{ij})^2.

The RSS of item j across all examinees is therefore

RSS_{j} = \sum_{i=1}^{N} (Y_{ij} - \eta_{ij})^2 = \sum_{m=1}^{2^K} \sum_{i \in C_{m}} (Y_{ij} - \eta_{jm})^2,

where C_m is latent proficiency class m, and N is the number of examinees. Chiu (2013) proved that the expectation of RSS_j is minimized for the correct q-vector among the 2^K - 1 candidates; see the paper for the justification.
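As an illustration of this criterion, the sketch below computes RSS_j for one candidate q-vector under DINA-type ideal responses (\eta_{ij} = 1 iff examinee i masters every attribute the candidate q-vector requires). The attribute profiles alpha are taken as given here for simplicity; the actual QR method obtains them via nonparametric classification.

```r
# Sketch of the RSS criterion for a single item (illustrative helper).
# Y: N x J binary responses; j: item index; q: candidate q-vector (0/1, length K);
# alpha: N x K matrix of 0/1 attribute profiles (assumed known here).
rss_item <- function(Y, j, q, alpha) {
  # DINA-type ideal response: 1 iff all required attributes are mastered.
  eta <- apply(alpha, 1, function(a) as.numeric(all(a[q == 1] == 1)))
  sum((Y[, j] - eta)^2)   # RSS_j = sum_i (Y_ij - eta_ij)^2
}
```

Evaluating rss_item() over all 2^K - 1 nonzero candidate q-vectors and keeping the minimizer is the comparison the refinement step performs for each item.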
The GDI is an extension of de la Torre's (2008) \delta-method, which has the limitation that it cannot be used with CDMs that divide examinees into more than two groups. In response, de la Torre and Chiu (2016) proposed to select the item attribute vector that maximizes the weighted variance of the probabilities of a correct response for the different groups, defined as

\zeta_j^2 = \sum_{l=1}^{2^{K_j}} P(\alpha_{lj}) \left[ P(Y_{ij} = 1 \mid \alpha_{lj}) - \bar{P}_{j} \right]^2,

where P(\alpha_{lj}) is the posterior probability for proficiency class \alpha_{lj}, and \bar{P}_{j} = \sum_{l=1}^{2^{K_j}} P(\alpha_{lj}) P(Y_{ij} = 1 \mid \alpha_{lj}), where l = 1, 2, \ldots, 2^{K_j}. De la Torre and Chiu (2016) called \zeta^2 the GDI, which can be applied to any CDM that can be reparameterized in terms of the G-DINA model.
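The weighted-variance formula translates directly into a few lines of R. The inputs below are hypothetical placeholders: in practice the posterior class probabilities and the class-specific success probabilities come from fitting the chosen CDM.

```r
# Sketch of the GDI (zeta_j^2) for one item (illustrative helper).
# post: P(alpha_l) for each latent class (sums to 1);
# pj:   P(Y_j = 1 | alpha_l) for the same classes.
gdi <- function(post, pj) {
  p_bar <- sum(post * pj)        # weighted mean success probability, P-bar_j
  sum(post * (pj - p_bar)^2)     # zeta_j^2: weighted variance around P-bar_j
}

# Four equiprobable classes; two "non-masters" and two "masters":
gdi(post = rep(0.25, 4), pj = c(0.2, 0.2, 0.8, 0.8))  # -> 0.09
```

The candidate q-vector yielding the largest zeta_j^2 is the one the GDI selects for item j.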
Chiu, C. Y. (2013). Statistical refinement of the Q-matrix in cognitive diagnosis. Applied Psychological Measurement, 37(8), 598-618.
Chiu, C. Y., & Douglas, J. A. (2013). A nonparametric approach to cognitive diagnosis by proximity to ideal response patterns. Journal of Classification, 30(2), 225-250.
de la Torre, J., & Chiu, C.-Y. (2016). A general method of empirical Q-matrix validation. Psychometrika, 81, 253-273.
de la Torre, J. (2008). An empirically based method of Q-matrix validation for the DINA model: Development and applications. Journal of Educational Measurement, 45, 343-362.