data.cdm: Several Datasets for the 'CDM' Package
In CDM: Cognitive Diagnosis Modeling

data.cdm

R Documentation

Description

Ten example datasets (data.cdm01 to data.cdm10) for the CDM package.

Usage

data(data.cdm01)
data(data.cdm02)
data(data.cdm03)
data(data.cdm04)
data(data.cdm05)
data(data.cdm06)
data(data.cdm07)
data(data.cdm08)
data(data.cdm09)
data(data.cdm10)

Format

Dataset data.cdm01

This is a multiple-choice dataset that is used with the mcdina function. The format is:

List of 3
$ data :'data.frame':
..$ I1 : int [1:5003] 3 3 4 1 1 1 1 1 1 1 ...
..$ I2 : int [1:5003] 1 1 3 1 1 2 1 1 2 1 ...
..$ I3 : int [1:5003] 4 3 2 3 2 2 2 2 1 2 ...
..$ I4 : int [1:5003] 3 3 3 2 2 2 2 3 3 1 ...
..$ I5 : int [1:5003] 2 2 2 3 1 1 2 3 2 1 ...
..$ I6 : int [1:5003] 3 1 1 1 1 2 1 1 1 1 ...
..$ I7 : int [1:5003] 1 1 2 2 1 3 1 1 1 3 ...
..$ I8 : int [1:5003] 1 1 1 1 1 2 1 4 3 3 ...
..$ I9 : int [1:5003] 3 2 1 1 1 1 3 3 1 3 ...
..$ I10: int [1:5003] 2 1 2 1 1 2 2 2 2 1 ...
..$ I11: int [1:5003] 2 2 2 2 1 2 1 2 1 1 ...
..$ I12: int [1:5003] 1 2 1 1 2 1 1 1 1 2 ...
..$ I13: int [1:5003] 2 1 1 1 2 1 2 2 1 1 ...
..$ I14: int [1:5003] 1 1 1 1 1 2 1 1 2 1 ...
..$ I15: int [1:5003] 1 2 1 1 1 1 1 1 1 1 ...
..$ I16: int [1:5003] 1 2 2 1 2 2 2 1 1 1 ...
..$ I17: int [1:5003] 1 1 1 1 1 1 1 1 1 1 ...
$ group : int [1:5003] 1 1 1 1 1 1 1 1 1 1 ...
$ q.matrix:'data.frame':
..$ item : int [1:52] 1 1 1 1 2 2 2 2 3 3 ...
..$ categ: int [1:52] 1 2 3 4 1 2 3 4 1 2 ...
..$ A1 : int [1:52] 0 1 0 1 0 1 1 1 0 0 ...
..$ A2 : int [1:52] 0 0 1 1 0 0 0 1 0 0 ...
..$ A3 : int [1:52] 0 0 0 0 0 0 0 0 0 0 ...
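
The Q-matrix above is stored in long format: one row per item-category pair, with one 0/1 column per attribute. As a minimal sketch, the snippet below rebuilds only the first eight rows displayed above (columns A1 and A2) in base R and counts the categories per item. The guarded call at the end illustrates how the full dataset might be passed to mcdina; it assumes the CDM package is installed and is skipped otherwise.

```r
# First eight Q-matrix rows as displayed above (items 1 and 2, A1 and A2 only).
q_head <- data.frame(
  item  = rep(1:2, each = 4),
  categ = rep(1:4, times = 2),
  A1    = c(0, 1, 0, 1, 0, 1, 1, 1),
  A2    = c(0, 0, 1, 1, 0, 0, 0, 1)
)

# Number of response categories listed for each item
n_categ <- table(q_head$item)

# Attribute pattern tied to each response option, e.g. "10" = A1 only
q_head$pattern <- paste0(q_head$A1, q_head$A2)
print(q_head)

# Fit on the full dataset only if CDM is available (skipped otherwise)
if (requireNamespace("CDM", quietly = TRUE)) {
  data(data.cdm01, package = "CDM")
  mod <- CDM::mcdina(dat = data.cdm01$data,
                     q.matrix = data.cdm01$q.matrix,
                     group = data.cdm01$group,
                     progress = FALSE)
  summary(mod)
}
```
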
Dataset data.cdm02

Multiple choice dataset with a Q-matrix designed for polytomous attributes.

List of 2
$ data :'data.frame':
..$ I1 : int [1:3000] 3 3 4 1 1 1 1 1 1 1 ...
..$ I2 : int [1:3000] 1 1 3 1 1 2 1 1 2 1 ...
..$ I3 : int [1:3000] 4 3 2 3 2 2 2 2 1 2 ...
[...]
..$ B17: num [1:3000] 1 1 1 1 1 1 1 1 1 1 ...
..$ B18: num [1:3000] 1 1 1 1 2 2 2 2 2 2 ...
$ q.matrix:'data.frame':
..$ item : int [1:100] 1 1 1 1 2 2 2 2 3 3 ...
..$ categ: int [1:100] 1 2 3 4 1 2 3 4 1 2 ...
..$ A1 : num [1:100] 0 1 0 1 0 1 1 1 0 0 ...
..$ A2 : num [1:100] 0 0 1 1 0 0 0 1 0 0 ...
..$ A3 : num [1:100] 0 0 0 0 0 0 0 0 0 0 ...
..$ B1 : num [1:100] 0 0 0 0 0 0 0 0 0 0 ...
Dataset data.cdm03:

This is a resimulated dataset from Chiu, Koehn and Wu (2016), where the data-generating model is the reduced RUM. See Example 1.

List of 2
$ data : num [1:725, 1:16] 0 1 1 1 1 1 1 1 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:16] "I01" "I02" "I03" "I04" ...
$ qmatrix:'data.frame': 16 obs. of 6 variables:
..$ item: Factor w/ 16 levels "I01","I02","I03",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ A1 : int [1:16] 1 0 0 0 0 0 0 0 1 1 ...
..$ A2 : int [1:16] 0 1 0 0 1 1 0 0 0 0 ...
..$ A3 : int [1:16] 0 0 1 1 1 1 0 0 0 0 ...
..$ A4 : int [1:16] 0 0 0 0 0 0 1 1 1 1 ...
..$ A5 : int [1:16] 0 0 0 0 0 0 0 0 0 0 ...
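
The qmatrix component carries an item label column in addition to the attribute columns, which is why Example 1 passes qmatrix[,-1] to the estimation function. The following base R sketch rebuilds only the first four rows displayed above and shows that step; the object names are illustrative and not part of the package.

```r
# First four Q-matrix rows as displayed above; the first column holds item labels.
qmatrix_head <- data.frame(
  item = factor(c("I01", "I02", "I03", "I04")),
  A1 = c(1, 0, 0, 0),
  A2 = c(0, 1, 0, 0),
  A3 = c(0, 0, 1, 1),
  A4 = c(0, 0, 0, 0),
  A5 = c(0, 0, 0, 0)
)

# Drop the label column so only the 0/1 attribute indicators remain
q_num <- as.matrix(qmatrix_head[, -1])
rownames(q_num) <- as.character(qmatrix_head$item)

# Number of attributes required by each item
n_attr <- rowSums(q_num)
print(n_attr)
```
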
Dataset data.cdm04:

Simulated dataset for the sequential DINA model (as described in Ma & de la Torre, 2016). The dataset contains 1000 persons and 12 items which measure 2 skills.

List of 3
$ data : num [1:1000, 1:12] 0 0 0 1 1 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:12] "I1" "I2" "I3" "I4" ...
$ q.matrix1:'data.frame': 18 obs. of 4 variables:
..$ Item: chr [1:18] "I1" "I2" "I3" "I4" ...
..$ Cat : int [1:18] 1 1 1 1 1 1 1 2 1 2 ...
..$ A1 : int [1:18] 1 1 1 0 0 0 1 1 1 1 ...
..$ A2 : int [1:18] 0 0 0 1 1 1 0 0 0 0 ...
$ q.matrix2:'data.frame': 18 obs. of 4 variables:
..$ Item: chr [1:18] "I1" "I2" "I3" "I4" ...
..$ Cat : int [1:18] 1 1 1 1 1 1 1 2 1 2 ...
..$ A1 : num [1:18] 1 1 1 0 0 0 1 1 1 1 ...
..$ A2 : num [1:18] 0 0 0 1 1 1 0 0 0 0 ...
Dataset data.cdm05:

Example dataset used in Philipp, Strobl, de la Torre and Zeileis (2018). This dataset is a sub-dataset of the probability dataset in the pks package (Heller & Wickelmaier, 2013).

List of 3
$ data :'data.frame': 504 obs. of 12 variables:
..$ b101: num [1:504] 1 1 1 1 1 1 1 1 1 1 ...
..$ b102: num [1:504] 1 1 1 1 1 1 1 1 1 1 ...
..$ b103: num [1:504] 1 1 1 1 1 1 1 1 1 1 ...
..$ b104: num [1:504] 1 1 1 1 0 1 0 0 0 1 ...
..$ b105: num [1:504] 1 0 1 1 1 1 0 1 1 1 ...
..$ b106: num [1:504] 1 1 1 1 1 1 1 1 1 1 ...
..$ b107: num [1:504] 1 1 1 1 1 1 1 1 1 1 ...
..$ b108: num [1:504] 1 1 1 1 1 1 0 1 1 1 ...
..$ b109: num [1:504] 1 1 0 1 1 0 0 1 1 0 ...
..$ b110: num [1:504] 0 0 0 1 0 0 0 0 0 1 ...
..$ b111: num [1:504] 0 1 0 0 0 1 0 0 0 0 ...
..$ b112: num [1:504] 1 1 0 1 0 1 0 1 0 0 ...
$ q.matrix:'data.frame': 12 obs. of 4 variables:
..$ pb: num [1:12] 1 0 0 0 1 1 1 1 1 0 ...
..$ cp: num [1:12] 0 1 0 0 1 1 0 0 0 1 ...
..$ un: num [1:12] 0 0 1 0 0 0 1 1 0 0 ...
..$ id: num [1:12] 0 0 0 1 0 0 0 0 1 1 ...
$ skills : Named chr [1:4] "how to calculate the classic probability "
..- attr(*, "names")=chr [1:4] "pb" "cp" "un" "id"
Dataset data.cdm06:

Resimulated example dataset from Chen and Chen (2017).

List of 3
$ data :'data.frame': 2733 obs. of 15 variables:
..$ I01: num [1:2733] 1 0 0 1 0 0 0 1 1 1 ...
..$ I02: num [1:2733] 1 0 0 1 1 0 1 0 0 1 ...
..$ I03: num [1:2733] 0 0 0 1 1 0 1 0 1 0 ...
..$ I04: num [1:2733] 1 1 0 0 0 0 1 1 1 0 ...
..$ I05: num [1:2733] 1 0 1 1 0 1 1 1 1 1 ...
..$ I06: num [1:2733] 0 0 0 1 1 0 0 0 1 1 ...
..$ I07: num [1:2733] 1 1 1 0 0 1 1 0 1 1 ...
..$ I08: num [1:2733] 0 0 0 0 0 0 0 0 1 1 ...
..$ I09: num [1:2733] 1 0 0 1 1 1 0 1 0 1 ...
..$ I10: num [1:2733] 0 0 0 1 0 1 1 0 1 1 ...
..$ I11: num [1:2733] 0 1 0 1 1 1 1 0 1 1 ...
..$ I12: num [1:2733] 0 1 0 1 0 0 0 1 1 1 ...
..$ I13: num [1:2733] 0 0 1 1 0 1 0 0 0 1 ...
..$ I14: num [1:2733] 0 0 0 1 1 0 1 1 0 0 ...
..$ I15: num [1:2733] 0 0 0 1 0 0 1 0 1 1 ...
$ q.matrix:'data.frame': 15 obs. of 5 variables:
..$ RI: num [1:15] 1 1 1 0 1 1 1 1 0 0 ...
..$ JS: num [1:15] 1 0 0 1 0 0 0 0 0 1 ...
..$ GI: num [1:15] 0 1 0 1 0 0 1 1 1 1 ...
..$ II: num [1:15] 0 1 1 0 1 0 1 0 0 0 ...
..$ MI: num [1:15] 0 0 1 0 0 0 0 0 1 0 ...
$ skills : chr [1:5, 1:2] "Retrieving explicit information " ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:5] "RI" "JS" "GI" "II" ...
.. ..$ : chr [1:2] "skill" "description"
Dataset data.cdm07:

This is a resimulated dataset based on social anxiety disorder data concerning social phobia, which involves 13 dichotomous questions (Fang, Liu & Ying, 2017). The simulation was based on a latent class model with five classes. The dataset was also used in Chen, Li, Liu and Ying (2017).

List of 3
$ data : num [1:863, 1:13] 1 0 1 1 1 1 1 1 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:13] "I1" "I2" "I3" "I4" ...
$ q.matrix: num [1:13, 1:3] 1 1 1 1 0 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:13] "I1" "I2" "I3" "I4" ...
.. ..$ : chr [1:3] "A1" "A2" "A3"
$ items : atomic [1:13] 1 speaking in front of other people? ...
..- attr(*, "stem")=chr "Have you ever had a strong fear or avoidance of ..."
Dataset data.cdm08:

This is a simulated dataset with four skills and three misconceptions for the SISM model, which simultaneously identifies skills and misconceptions (Kuo, Chen & de la Torre, 2018). The Q-matrix follows the specification in their simulation study.

List of 2
$ data :'data.frame': 1300 obs. of 20 variables:
..$ I01: num [1:1300] 1 0 0 1 1 1 1 1 1 1 ...
..$ I02: num [1:1300] 0 0 0 0 1 1 1 1 1 1 ...
..$ I03: num [1:1300] 0 0 0 0 1 1 1 1 1 1 ...
..$ I04: num [1:1300] 1 1 0 1 0 1 1 0 1 1 ...
..$ I05: num [1:1300] 1 1 1 0 1 1 0 1 1 1 ...
..[...]
..$ I18: num [1:1300] 0 1 0 0 0 0 0 0 0 1 ...
..$ I19: num [1:1300] 1 1 0 0 0 0 0 1 1 1 ...
..$ I20: num [1:1300] 1 1 0 0 0 1 0 1 0 1 ...
$ q.matrix:'data.frame': 20 obs. of 7 variables:
..$ S1: num [1:20] 1 0 0 0 0 0 0 1 0 0 ...
..$ S2: num [1:20] 0 1 0 0 0 0 0 0 1 0 ...
..$ S3: num [1:20] 0 0 1 0 0 0 0 0 0 1 ...
..$ S4: num [1:20] 0 0 0 1 0 0 0 0 0 0 ...
..$ B1: num [1:20] 0 0 0 0 1 0 0 1 1 0 ...
..$ B2: num [1:20] 0 0 0 0 0 1 0 0 0 0 ...
..$ B3: num [1:20] 0 0 0 0 0 0 1 0 0 1 ...
Dataset data.cdm09:

This is a simulated dataset involving polytomous skills, adapted from the empirical example (proportional reasoning data) of Chen and de la Torre (2013).

List of 2
$ data : num [1:500, 1:15] 1 0 1 1 0 1 1 1 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:15] "I1" "I2" "I3" "I4" ...
$ q.matrix:'data.frame': 15 obs. of 4 variables:
..$ A1: int [1:15] 0 0 0 0 2 0 0 2 1 1 ...
..$ A2: int [1:15] 1 0 2 0 0 1 2 0 1 1 ...
..$ A3: int [1:15] 0 0 0 1 0 0 0 0 0 0 ...
..$ A4: int [1:15] 0 1 1 0 0 0 0 0 0 0 ...
Dataset data.cdm10:

This is a simulated dataset involving a hierarchical skill structure. Skill A has four levels, skill B has two levels, and skill C has three levels.

List of 2
$ data : num [1:1500, 1:15] 1 1 0 0 0 1 1 0 0 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:15] "I1" "I2" "I3" "I4" ...
$ q.matrix: num [1:15, 1:6] 1 1 1 1 1 1 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:15] "I1" "I2" "I3" "I4" ...
.. ..$ : chr [1:6] "A1" "A2" "A3" "B1" ...
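
The six Q-matrix columns are consistent with coding a skill with m levels as m - 1 binary indicator columns, since (4 - 1) + (2 - 1) + (3 - 1) = 6. The base R sketch below works through that arithmetic and counts how many of the 2^6 binary patterns respect the hierarchy. It assumes cumulative coding (a higher level of a skill can only be mastered if all lower levels are), which is an assumption for illustration and not taken from the package.

```r
# Number of levels per skill, as stated in the description above
levels_per_skill <- c(A = 4, B = 2, C = 3)

# Assumed coding: m levels -> m - 1 binary indicator columns
n_cols <- sum(levels_per_skill - 1)

# All binary patterns over the indicator columns
patterns <- expand.grid(rep(list(0:1), n_cols))

# Skill label for each indicator column: A, A, A, B, C, C
grp <- rep(names(levels_per_skill), times = levels_per_skill - 1)

# A pattern respects the hierarchy if, within each skill, the indicators
# never switch from 0 back to 1 (non-increasing sequence)
is_admissible <- apply(patterns, 1, function(p) {
  all(tapply(p, grp, function(v) all(diff(v) <= 0)))
})

print(n_cols)
print(sum(is_admissible))
```

Under this assumed coding, the count of admissible patterns equals the product of the level counts, 4 * 2 * 3, which is far fewer than the number of unrestricted patterns.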

References

Chen, H., & Chen, J. (2017). Cognitive diagnostic research on Chinese students' English listening skills and implications on skill training. English Language Teaching, 10(12), 107-115. http://dx.doi.org/10.5539/elt.v10n12p107

Chen, J., & de la Torre, J. (2013). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement, 37, 419-437. http://dx.doi.org/10.1177/0146621613479818

Chen, Y., Li, X., Liu, J., & Ying, Z. (2017). Regularized latent class analysis with application in cognitive diagnosis. Psychometrika, 82, 660-692. http://dx.doi.org/10.1007/s11336-016-9545-6

Chiu, C.-Y., Koehn, H.-F., & Wu, H.-M. (2016). Fitting the reduced RUM with Mplus: A tutorial. International Journal of Testing, 16(4), 331-351. http://dx.doi.org/10.1080/15305058.2016.1148038

Fang, G., Liu, J., & Ying, Z. (2017). On the identifiability of diagnostic classification models. arXiv, 1706.01240. https://arxiv.org/abs/1706.01240

Heller, J., & Wickelmaier, F. (2013). Minimum discrepancy estimation in probabilistic knowledge structures. Electronic Notes in Discrete Mathematics, 42, 49-56. http://dx.doi.org/10.1016/j.endm.2013.05.145

Kuo, B.-C., Chen, C.-H., & de la Torre, J. (2018). A cognitive diagnosis model for identifying coexisting skills and misconceptions. Applied Psychological Measurement, 42(3), 179-191. http://dx.doi.org/10.1177/0146621617722791

Ma, W., & de la Torre, J. (2016). A sequential cognitive diagnosis model for polytomous responses. British Journal of Mathematical and Statistical Psychology, 69(3), 253-275. https://doi.org/10.1111/bmsp.12070

Philipp, M., Strobl, C., de la Torre, J., & Zeileis, A. (2018). On the estimation of standard errors in cognitive diagnosis models. Journal of Educational and Behavioral Statistics, 43(1), 88-115. http://dx.doi.org/10.3102/1076998617719728

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: Reduced RUM model, Chiu et al. (2016)
#############################################################################

data(data.cdm03, package="CDM")
dat <- data.cdm03$data
qmatrix <- data.cdm03$qmatrix

#*** Model 1: Reduced RUM
mod1 <- CDM::gdina( dat, q.matrix=qmatrix[,-1], rule="RRUM" )
summary(mod1)

#*** Model 2: Additive model with identity link function
mod2 <- CDM::gdina( dat, q.matrix=qmatrix[,-1], rule="ACDM" )
summary(mod2)

#*** Model 3: Additive model with logit link function
mod3 <- CDM::gdina( dat, q.matrix=qmatrix[,-1], rule="ACDM", linkfct="logit")
summary(mod3)

#############################################################################
# EXAMPLE 2: GDINA model - probability dataset from the pks package
#############################################################################

data(data.cdm05, package="CDM")
dat <- data.cdm05$data
Q <- data.cdm05$q.matrix

#* estimate model
mod1 <- CDM::gdina( dat, q.matrix=Q )
summary(mod1)

## End(Not run)

CDM documentation built on Aug. 8, 2025, 6:12 p.m.