tam.mml.3pl: 3PL Structured Item Response Model in 'TAM'

Description Usage Arguments Details Value Note References See Also Examples

View source: R/tam.mml.3pl.R

Description

This estimates a 3PL model with design matrices for item slopes and item intercepts. Discrete distributions of the latent variables are allowed which can be log-linearly smoothed.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
tam.mml.3pl(resp, Y=NULL, group=NULL, formulaY=NULL, dataY=NULL, ndim=1,
  pid=NULL, xsi.fixed=NULL, xsi.inits=NULL, xsi.prior=NULL,
  beta.fixed=NULL, beta.inits=NULL, variance.fixed=NULL, variance.inits=NULL,
  est.variance=TRUE, A=NULL, notA=FALSE, Q=NULL, Q.fixed=NULL, E=NULL,
  gammaslope.des="2PL", gammaslope=NULL, gammaslope.fixed=NULL,
  est.some.slopes=TRUE, gammaslope.max=9.99, gammaslope.constr.V=NULL,
  gammaslope.constr.c=NULL, gammaslope.center.index=NULL,  gammaslope.center.value=NULL,
  gammaslope.prior=NULL, userfct.gammaslope=NULL, gammaslope.constr.Npars=0,
  est.guess=NULL, guess=rep(0, ncol(resp)),
  guess.prior=NULL, max.guess=0.5, skillspace="normal", theta.k=NULL,
  delta.designmatrix=NULL, delta.fixed=NULL, delta.inits=NULL, pweights=NULL,
  item.elim=TRUE, verbose=TRUE, control=list(), Edes=NULL )

## S3 method for class 'tam.mml.3pl'
summary(object,file=NULL,...)

## S3 method for class 'tam.mml.3pl'
print(x,...)

Arguments

resp

Data frame with polytomous item responses k=0,...,K. Missing responses must be declared as NA.

Y

A matrix of covariates in latent regression. Note that the intercept is automatically included as the first predictor.

group

An optional vector of group identifiers

formulaY

An R formula for latent regression. Transformations of predictors in Y (included in dataY) can be easily specified, e. g. female*race or I(age^2).

dataY

An optional data frame with possible covariates Y in latent regression. This data frame will be used if an R formula in formulaY is specified.

ndim

Number of dimensions (is not needed to determined by the user)

pid

An optional vector of person identifiers

xsi.fixed

A matrix with two columns for fixing ξ parameters. 1st column: index of ξ parameter, 2nd column: fixed value

xsi.inits

A matrix with two columns (in the same way defined as in xsi.fixed with initial value for ξ.

xsi.prior

Item-specific prior distribution for ξ parameters. It is assumed that ξ_k \sim N( μ_k, σ_k^2 ). The first column in xsi.prior is μ_k, the second is σ_k.

beta.fixed

A matrix with three columns for fixing regression coefficients. 1st column: Index of Y value, 2nd column: dimension, 3rd column: fixed β value.
If no constraints should be imposed on β, then set beta.fixed=FALSE (see Example 2, Model 2_4).

beta.inits

A matrix (same format as in beta.fixed) with initial β values

variance.fixed

An optional matrix. In case of a single group it is a matrix with three columns for fixing entries in covariance matrix: 1st column: dimension 1, 2nd column: dimension 2, 3rd column: fixed value of covariance/variance. In case of multiple groups, it is a matrix with four columns 1st column: group index (from 1,…,G, 2nd column: dimension 1, 3rd column: dimension 2, 4th column: fixed value of covariance

variance.inits

Initial covariance matrix in estimation. All matrix entries have to be specified and this matrix is NOT in the same format like variance.fixed.

est.variance

Should the covariance matrix be estimated? This argument applies to estimated item slopes in tam.mml.2pl. The default is FALSE which means that latent variables (in the first group) are standardized in 2PL estimation.

A

An optional array of dimension I \times (K+1) \times N_ξ. Only ξ parameters are estimated, entries in A only correspond to the design.

notA

An optional logical indicating whether all entries in the A matrix are set to zero and no item intercept ξ should be estimated.

Q

An optional I \times D matrix (the Q-matrix) which specifies the loading structure of items on dimensions.

Q.fixed

Optional I \times D matrix of the same dimensions like Q. Non NA entries contain values at which item loadings should be fixed to.

E

Optional design array for item slopes γ. It is a four dimensional array of size I \times (K+1) \times D \times N_γ containing items, categories, dimensions, γ parameter.

gammaslope.des

Optional string indicating type of item slope parameter to be estimated. gammaslope.des="2PL" estimates a slope parameter for an item, gammaslope.des="2PLcat" for an item and a

gammaslope

Initial or fixed vector of γ parameters

gammaslope.fixed

An optional matrix containing fixed values in the γ vector. First column: parameter index; second colunmn: fixed value.

est.some.slopes

An optional logical indicating whether some item slopes should be estimated.

gammaslope.max

Value for absolute entries in γ vector

gammaslope.constr.V

An optional constraint matrix V for item slope parameters γ

gammaslope.constr.c

An optional constraint vector c for item slope parameters γ

gammaslope.center.index

Indices of gammaslope parameters which should be fixed to sum specified in gammaslope.center.value (see Example 7).

gammaslope.center.value

Specified values of sum of subset of gammaslope parameters.

gammaslope.prior

Item-specific prior distribution for γ parameters. It is assumed that γ_k \sim N( μ_k, σ_k^2 ). The first column in gammaslope.prior is μ_k, the second is σ_k.

userfct.gammaslope

A user specified function for constraints or transformations of the γ parameters within the algorithm. See Example 17 in tam.mml.

gammaslope.constr.Npars

Number of constrained γ parameters in userfct.gammaslope

est.guess

An optional vector of integers indicating for which items a guessing parameter should be estimated. Same integers correspond to same estimated guessing parameters. A value of 0 denotes an item for which no guessing parameter is estimated.

guess

Fixed or initial guessing parameters

guess.prior

Item-specific prior distribution for guessing parameters c_i. It is assumed that c_i \sim Beta(a_i, b_i). The first column in gammaslope.prior is a_i, the second is b_i.

max.guess

Upper bound for guessing parameters

skillspace

Skill space: normal distribution ("normal") or discrete distribution ("discrete").

theta.k

A matrix of the \bold{θ} skill space in case of a discrete distribution (skillspace="discrete").

delta.designmatrix

A design matrix of the log-linear model for the skill space in case of a discrete distribution (skillspace="discrete").

delta.fixed

Fixed δ values of the log-linear skill space. delta.fixed must be a matrix with three columns. First column: δ parameter index, Second column: Group index, Third column: Fixed δ parameter value.

delta.inits

Optional initial matrix of δ parameters.

pweights

Optional vector of person weights.

item.elim

Optional logical indicating whether an item with has only zero entries should be removed from the analysis. The default is TRUE.

verbose

Logical indicating whether output should be printed during iterations. This argument replaces control$progress.

control

See tam.mml for more details.

Edes

Compact form of design matrix. This argument is only defined for convenience for models with random starting values to avoid recalculations.

object

Object of class tam.mml.3pl

file

A file name in which the summary output will be written

x

Object of class tam.mml.3pl

...

Further arguments to be passed

Details

The item response model for item i and category h and no guessing parameters can be written as

P( X_{i}=h | \bold{θ} ) \propto \exp( ∑_d b_{ihd} θ_d + ∑_k a_{ih} ξ_k )

For item slopes, a linear decomposition is allowed

b_{ihd}=∑_k e_{ihdk} γ_k

In case of a guessing parameter, the item response function is

P( X_{i}=h | \bold{θ} )=c_i + ( 1 - c_i ) \cdot ( 1 + \exp( - ∑_d b_{ihd} θ_d - ∑_k a_{ih} ξ_k ) )^{-1}

For possibilities of definitions of the design matrix E=(e_{ihdk}) see Formann (2007), for the strongly related linear logistic latent class model see Formann (1992). Linear constraints on γ can be specified by equations V γ=c and using the arguments gammaslope.constr.V and gammaslope.constr.c.

Like in tam.mml, a multivariate linear regression

\bold{θ}=Y β + \bold{ε}

assuming a multivariate normal distribution on the residuals \bold{ε} can be specified (skillspace="normal").

Alternatively, a log-linear distribution of the skill classes P(θ) (skillspace="discrete") is performed

\log P(θ )=D_{ δ } δ

See Xu and von Davier (2008). The design matrix D_{δ} can be specified in delta.designmatrix. The theta grid \bold{θ} of the skill space can be defined in theta.k.

Value

The same output as in tam.mml and additional entries

delta

Parameter vector δ

gammaslope

Estimated γ item slope parameters

se.gammaslope

Standard errors γ item slope parameters

E

Used design matrix E

Edes

Used design matrix E in compact form

guess

Estimated c guessing parameters

se.guess

Standard errors c guessing parameters

Note

The implementation of the model builds on pieces work of Anton Formann. See http://www.antonformann.at/ for more information.

References

Formann, A. K. (1992). Linear logistic latent class analysis for polytomous data. Journal of the American Statistical Association, 87, 476-486. doi: 10.2307/2290280

Formann, A. K. (2007). (Almost) Equivalence between conditional and mixture maximum likelihood estimates for some models of the Rasch type. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and mixture distribution Rasch models (pp. 177-189). New York: Springer. doi: 10.1007/978-0-387-49839-3_11

Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS. doi: 10.1002/j.2333-8504.2008.tb02113.x

See Also

See also tam.mml.

See the CDM::slca function in the CDM package for a similar method.

logLik.tam, anova.tam

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
## Not run: 
#############################################################################
# EXAMPLE 1: Dichotomous data | data.sim.rasch
#############################################################################

data(data.sim.rasch)
dat <- data.sim.rasch
# some control arguments
ctl.list <- list(maxiter=100) # increase the number of iterations in applications!

#*** Model 1: Rasch model, normal trait distribution
mod1 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", est.some.slopes=FALSE,
              control=ctl.list)
summary(mod1)

#*** Model 2: Rasch model, discrete trait distribution
#  choose theta grid
theta.k <- seq( -3, 3, len=7 )    # discrete theta grid distribution
# define symmetric trait distribution
delta.designmatrix <- matrix( 0, nrow=7, ncol=4 )
delta.designmatrix[4,1] <- 1
delta.designmatrix[c(3,5),2] <- 1
delta.designmatrix[c(2,6),3] <- 1
delta.designmatrix[c(1,7),4] <- 1
mod2 <- TAM::tam.mml.3pl(resp=dat, skillspace="discrete", est.some.slopes=FALSE,
           theta.k=theta.k, delta.designmatrix=delta.designmatrix, control=ctl.list)
summary(mod2)

#*** Model 3: 2PL model
mod3 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", gammaslope.des="2PL",
       control=ctl.list, est.variance=FALSE )
summary(mod3)

#*** Model 4: 3PL model
# estimate guessing parameters for items 3,7,9 and 12
I <- ncol(dat)
est.guess <- rep(0, I )
# set parameters 9 and 12 equal -> same integers
est.guess[ c(3,7,9,12) ] <- c( 1, 3, 2, 2 )
# starting values guessing parameter
guess <- .2*(est.guess > 0)
# estimate model
mod4 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", gammaslope.des="2PL",
        control=ctl.list, est.guess=est.guess, guess=guess, est.variance=FALSE)
summary(mod4)

#--- specification in tamaan
tammodel <- "
LAVAAN MODEL:
  F1=~ I1__I40
  F1 ~~ 1*F1
  I3 + I7 ?=g1
  I9 + I12 ?=g912 * g1
    "
mod4a <- TAM::tamaan( tammodel, resp=dat, control=list(maxiter=20))
summary(mod4a)

#*** Model 5: 3PL model, add some prior Beta distribution
guess.prior <- matrix( 0, nrow=I, ncol=2 )
guess.prior[ est.guess  > 0, 1] <- 5
guess.prior[ est.guess  > 0, 2] <- 17
mod5 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", gammaslope.des="2PL",
        control=ctl.list, est.guess=est.guess, guess=guess, guess.prior=guess.prior)
summary(mod5)

#--- specification in tamaan
tammodel <- "
LAVAAN MODEL:
  F1=~ I1__I40
  F1 ~~ 1*F1
  I3 + I7 ?=g1
  I9 + I12 ?=g912 * g1
MODEL PRIOR:
  g912 ~ Beta(5,17)
  I3_guess ~ Beta(5,17)
  I7_guess ~ Beta(5,17)
    "
mod5a <- TAM::tamaan( tammodel, resp=dat, control=list(maxiter=20))

#*** Model 6: 2PL model with design matrix for item slopes
I <- 40         # number of items
D <- 1       # dimensions
maxK <- 2    # maximum number of categories
Ngam <- 13   # number of different slope parameters
E <- array( 0, dim=c(I,maxK,D,Ngam) )
# joint slope parameters for items 1 to 10, 11 to 20, 21 to 30
E[ 1:10, 2, 1, 2 ] <- 1
E[ 11:20, 2, 1, 1 ] <- 1
E[ 21:30, 2, 1, 3 ] <- 1
for (ii in 31:40){   E[ii,2,1,ii - 27 ] <- 1 }
# estimate model
mod6 <- TAM::tam.mml.3pl(resp=dat, control=ctl.list,   E=E, est.variance=FALSE  )
summary(mod6)

#*** Model 6b: Truncated normal prior distribution for slope parameters
gammaslope.prior <- matrix( 0, nrow=Ngam, ncol=4 )
gammaslope.prior[,1] <- 2  # mean
gammaslope.prior[,2] <- 10  # standard deviation
gammaslope.prior[,3] <- -Inf # lower bound
gammaslope.prior[ 4:13,3] <- 1.2
gammaslope.prior[,4] <- Inf  # upper bound
# estimate model
mod6b <- TAM::tam.mml.3pl(resp=dat, E=E, est.variance=FALSE,
                gammaslope.prior=gammaslope.prior, control=ctl.list )
summary(mod6b)

#*** Model 7: 2PL model with design matrix of slopes and slope constraints
Ngam <- dim(E)[4]   # number of gamma parameters
# define two constraint equations
gammaslope.constr.V <- matrix( 0, nrow=Ngam, ncol=2 )
gammaslope.constr.c <- rep(0,2)
# set sum of first two xlambda entries to 1.8
gammaslope.constr.V[1:2,1] <- 1
gammaslope.constr.c[1] <- 1.8
# set sum of entries 4, 5 and 6 to 3
gammaslope.constr.V[4:6,2] <- 1
gammaslope.constr.c[2] <- 3
mod7 <- TAM::tam.mml.3pl(resp=dat, control=ctl.list,  E=E,  est.variance=FALSE,
   gammaslope.constr.V=gammaslope.constr.V, gammaslope.constr.c=gammaslope.constr.c)
summary(mod7)

#**** Model 8: Located latent class Rasch model with estimated three skill points

# three classes of theta's are estimated
TP <- 3
theta.k <- diag(TP)
# because item difficulties are unrestricted, we define the sum of the estimated
# theta points equal to zero
Ngam <- TP  # estimate three gamma loading parameters which are discrete theta points
E <- array( 0, dim=c(I,2,TP,Ngam) )
E[,2,1,1] <- E[,2,2,2] <- E[,2,3,3] <- 1
gammaslope.constr.V <- matrix( 1, nrow=3, ncol=1 )
gammaslope.constr.c <- c(0)
# initial gamma values
gammaslope <- c( -2, 0, 2 )
# estimate model
mod8 <- TAM::tam.mml.3pl(resp=dat, control=ctl.list,  E=E,  skillspace="discrete",
     theta.k=theta.k, gammaslope=gammaslope, gammaslope.constr.V=gammaslope.constr.V,
     gammaslope.constr.c=gammaslope.constr.c )
summary(mod8)

#*** Model 9: Multidimensional multiple group model
N <- nrow(dat)
I <- ncol(dat)
group <- c( rep(1,N/4), rep(2,N/4), rep(3,N/2) )
Q <- matrix(0,nrow=I,ncol=2)
Q[ 1:(I/2), 1] <- Q[ seq(I/2+1,I), 2] <- 1
# estimate model
mod9 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal", est.some.slopes=FALSE,
               group=group, Q=Q)
summary(mod9)

#############################################################################
# EXAMPLE 2: Polytomous data
#############################################################################

data( data.mg, package="CDM")
dat <- data.mg[1:1000, paste0("I",1:11)]

#*******************************************************
#*** Model 1: 1-dimensional 1PL estimation, normal skill distribution
mod1 <- TAM::tam.mml.3pl(resp=dat, skillspace="normal",
           gammaslope.des="2PL", est.some.slopes=FALSE,  est.variance=TRUE  )
summary(mod1)

#*******************************************************
#*** Model 2: 1-dimensional 2PL estimation, discrete skill distribution
# define skill space
theta.k <- matrix( seq(-5,5,len=21) )
# allow skew skill distribution
delta.designmatrix <- cbind( 1, theta.k, theta.k^2, theta.k^3 )
# fix 13th xsi item parameter to zero
xsi.fixed <- cbind( 13, 0 )
# fix 10th slope paremeter to one
gammaslope.fixed <- cbind( 10, 1 )
# estimate model
mod2 <- TAM::tam.mml.3pl(resp=dat, skillspace="discrete", theta.k=theta.k,
          delta.designmatrix=delta.designmatrix, gammaslope.des="2PL", xsi.fixed=xsi.fixed,
          gammaslope.fixed=gammaslope.fixed)
summary(mod2)

#*******************************************************
#*** Model 3: 2-dimensional 2PL estimation, normal skill distribution

# define loading matrix
Q <- matrix(0,11,2)
Q[1:6,1] <- 1   # items 1 to 6 load on dimension 1
Q[7:11,2] <- 1  # items 7 to 11 load on dimension 2
# estimate model
mod3 <- TAM::tam.mml.3pl(resp=dat, gammaslope.des="2PL", Q=Q )
summary(mod3)

#############################################################################
# EXAMPLE 3: Dichotomous data with guessing
#############################################################################

#*** simulate data
set.seed(9765)
N <- 4000   # number of persons
I <- 20     # number of items
b <- seq( -1.5, 1.5, len=I )
guess <- rep(0, I )
guess.items <- c(6,11,16)
guess[ guess.items ] <- .33
library(sirt)
dat <- sirt::sim.raschtype( stats::rnorm(N), b=b, fixed.c=guess )

#*******************************************************
#*** Model 1: Difficulty + guessing model, i.e. fix slopes to 1
est.guess <- rep(0,I)
est.guess[ guess.items ] <- seq(1, length(guess.items) )
# define prior distribution
guess.prior <- matrix( cbind( 5, 17 ), I, 2, byrow=TRUE )
guess.prior[ ! est.guess, ] <- 0
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, guess=guess, est.guess=est.guess,
           guess.prior=guess.prior, control=ctl.list,est.variance=TRUE,
           est.some.slopes=FALSE )
summary(mod1)

#*******************************************************
#*** Model 2: estimate a joint guessing parameter
est.guess <- rep(0,I)
est.guess[ guess.items ] <- 1
# estimate model
mod2 <- TAM::tam.mml.3pl(resp=dat, guess=guess, est.guess=est.guess,
            guess.prior=guess.prior, control=ctl.list,est.variance=TRUE,
            est.some.slopes=FALSE )
summary(mod2)

#############################################################################
# EXAMPLE 4: Latent class model with two classes
#      See slca Simulated Example 2 in the CDM package
#############################################################################

#*******************************************************
#*** simulate data
set.seed(9876)
I <- 7  # number of items
# simulate response probabilities
a1 <- round( stats::runif(I,0, .4 ),4)
a2 <- round( stats::runif(I, .6, 1 ),4)
N <- 1000   # sample size
# simulate data in two classes of proportions .3 and .7
N1 <- round(.3*N)
dat1 <- 1 * ( matrix(a1,N1,I,byrow=TRUE) > matrix( stats::runif( N1 * I), N1, I ) )
N2 <- round(.7*N)
dat2 <- 1 * ( matrix(a2,N2,I,byrow=TRUE) > matrix( stats::runif( N2 * I), N2, I ) )
dat <- rbind( dat1, dat2 )
colnames(dat) <- paste0("I", 1:I)

#*******************************************************
# estimation using tam.mml.3pl function

# define design matrices
TP <- 2     # two classes
theta.k <- diag(TP)     # there are theta dimensions -> two classes
# The idea is that latent classes refer to two different "dimensions".
# Items load on latent class indicators 1 and 2, see below.
E <- array(0, dim=c(I,2,2,2*I) )
items <- colnames(dat)
dimnames(E)[[4]] <- c(paste0( colnames(dat), "Class", 1),
          paste0( colnames(dat), "Class", 2) )
# items, categories, classes, parameters
# probabilities for correct solution
for (ii in 1:I){
    E[ ii, 2, 1, ii ] <- 1    # probabilities class 1
    E[ ii, 2, 2, ii+I ] <- 1  # probabilities class 2
                    }

# estimation command
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, control=list(maxit=20), skillspace="discrete",
          theta.k=theta.k, notA=TRUE)
summary(mod1)
# compare simulated and estimated data
cbind( mod1$rprobs[,2,1], a2 )  # Simulated class 2
cbind( mod1$rprobs[,2,2], a1 )  # Simulated class 1

#*******************************************************
#** specification with tamaan
tammodel <- "
ANALYSIS:
  TYPE=LCA;
  NCLASSES(2);   # 2 classes
  NSTARTS(5,20); # 5 random starts with 20 iterations
LAVAAN MODEL:
  F=~ I1__I7
    "
mod1b <- TAM::tamaan( tammodel, resp=dat )
summary(mod1b)
# compare with mod1
logLik(mod1)
logLik(mod1b)

#############################################################################
# EXAMPLE 5: Located latent class model, Rasch model
#      See slca Simulated Example 4 in the CDM package
#############################################################################

#*** simulate data
set.seed(487)
I <- 15  # I items
b1 <- seq( -2, 2, len=I)      # item difficulties
N <- 2000    # number of persons
# simulate 4 theta classes
theta0 <- c( -2.5, -1, 0.3, 1.3 )  # skill classes
probs0 <- c( .1, .4, .2, .3 )      # skill class probabilities
TP <- length(theta0)
theta <- theta0[ rep(1:TP, round(probs0*N)  ) ]
library(sirt)
dat <- sirt::sim.raschtype( theta, b1 )
colnames(dat) <- paste0("I",1:I)

#*******************************************************
#*** Model 1: Located latent class model with 4 classes
maxK <- 2
Ngam <- TP
E <- array( 0, dim=c(I, maxK, TP,  Ngam ) )
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- paste0("Cat", 1:(maxK) )
dimnames(E)[[3]] <- paste0("Class", 1:TP)
dimnames(E)[[4]] <- paste0("theta", 1:TP)
# theta design
for (tt in 1:TP){   E[1:I, 2, tt,  tt] <- 1       }
theta.k <- diag(TP)
# set eighth item difficulty to zero
xsi.fixed <- cbind( 8, 0 )
# initial gamma parameter
gammaslope <- seq( -1.5, 1.5, len=TP)
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, xsi.fixed=xsi.fixed,
           control=list(maxiter=100), skillspace="discrete",
           theta.k=theta.k, gammaslope=gammaslope)
summary(mod1)
# compare estimated and simulated theta class locations
cbind( mod1$gammaslope, theta0 )
# compare estimated and simulated latent class proportions
cbind( mod1$pi.k, probs0 )

#----- specification using tamaan
tammodel <- "
ANALYSIS:
  TYPE=LOCLCA;
  NCLASSES(4)
LAVAAN MODEL:
  F=~ I1__I15
  I8 | 0*t1
    "
mod1b <- TAM::tamaan( tammodel, resp=dat )
summary(mod1b)

#############################################################################
# EXAMPLE 6: DINA model with two skills
#      See slca Simulated Example 5 in the CDM package
#############################################################################

#*** simulate data
set.seed(487)
N <- 3000   # number of persons
# define Q-matrix
I <- 9  # 9 items
NS <- 2 # 2 skills
TP <- 4 # number of skill classes
Q <- scan(nlines=3, text=
  "1 0   1 0   1 0
   0 1   0 1   0 1
   1 1   1 1   1 1",
   )
Q <- matrix(Q, I, ncol=NS, byrow=TRUE)
# define skill distribution
alpha0 <- matrix( c(0,0,1,0,0,1,1,1), nrow=4,ncol=2,byrow=TRUE)
prob0 <- c( .2, .4, .1, .3 )
alpha <- alpha0[ rep( 1:TP, prob0*N),]
# define guessing and slipping parameters
guess <- round( stats::runif(I, 0, .4 ), 2 )
slip <- round( stats::runif(I, 0, .3 ), 2 )
# simulate data according to the DINA model
dat <- CDM::sim.din( q.matrix=Q, alpha=alpha, slip=slip, guess=guess )$dat

#*** Model 1: Estimate DINA model
# define E matrix which contains the anti-slipping parameters
maxK <- 2
Ngam <- I
E <- array( 0, dim=c(I, maxK, TP,  Ngam ) )
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- paste0("Cat", 1:(maxK) )
dimnames(E)[[3]] <- c("S00","S10","S01","S11")
dimnames(E)[[4]] <- paste0( "antislip", 1:I )
# define anti-slipping parameters in E
for (ii in 1:I){
        # define latent responses
        latresp <- 1*( alpha0 %*% Q[ii,]==sum(Q[ii,]) )[,1]
        # model slipping parameters
        E[ii, 2, latresp==1, ii ] <- 1
                 }
# skill space definition
theta.k <- diag(TP)
gammaslope <- rep( qlogis( .8 ), I )

# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, control=list(maxiter=100), skillspace="discrete",
          theta.k=theta.k, gammaslope=gammaslope)
summary(mod1)
# compare estimated and simulated latent class proportions
cbind( mod1$pi.k, probs0 )
# compare estimated and simulated guessing parameters
cbind( mod1$rprobs[,2,1], guess )
# compare estimated and simulated slipping parameters
cbind( 1 - mod1$rprobs[,2,4], slip )

#############################################################################
# EXAMPLE 7: Mixed Rasch model with two classes
#      See slca Simulated Example 3 in the CDM package
#############################################################################

#*** simulate data
set.seed(987)
library(sirt)
# simulate two latent classes of Rasch populations
I <- 15  # 6 items
b1 <- seq( -1.5, 1.5, len=I)      # difficulties latent class 1
b2 <- b1    # difficulties latent class 2
b2[ c(4,7, 9, 11, 12, 13) ] <- c(1, -.5, -.5, .33, .33, -.66 )
b2 <- b2 - mean(b2)
N <- 3000    # number of persons
wgt <- .25       # class probability for class 1
# class 1
dat1 <- sirt::sim.raschtype( stats::rnorm( wgt*N ), - b1 )
# class 2
dat2 <- sirt::sim.raschtype( stats::rnorm( (1-wgt)*N, mean=1, sd=1.7), - b2 )
dat <- rbind( dat1, dat2 )

# The idea is that each grid point class x theta is defined as new
# dimension. If we approximate the trait distribution by 7 theta points
# and are interested in estimating 2 latent classes, then we need
# 7*2=14 dimensions.

#*** Model 1: Rasch model
# theta grid
theta.k1 <- seq( -5, 5, len=7 )
TT <- length(theta.k1)
#-- define theta design matrix
theta.k <- diag(TT)
#-- delta designmatrix
delta.designmatrix <- matrix( 0, TT, ncol=3 )
delta.designmatrix[, 1] <- 1
delta.designmatrix[, 2:3] <- cbind( theta.k1, theta.k1^2 )

#-- define loading matrix E
E <- array( 0, dim=c(I,2,TT,I + 1) )  # last parameter is constant 1
for (ii in 1:I){
    E[ ii, 2, 1:TT, ii ] <- -1   # '-b' in '1*theta - b'
    E[ ii, 2, 1:TT, I+1] <- theta.k1  # '1*theta' in '1*theta - b'
                }
# initial gammaslope parameters
par1 <- stats::qlogis( colMeans( dat ) )
gammaslope <- c( par1, 1 )
# sum constraint of zero on item difficulties
gammaslope.constr.V <- matrix( 0, I+1, 1 )
gammaslope.constr.V[ 1:I, 1] <- 1  # Class 1
gammaslope.constr.c <- c(0)
# fixed gammaslope parameter
gammaslope.fixed <- cbind( I+1, 1 )
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat1, E=E, skillspace="discrete",
           theta.k=theta.k, delta.designmatrix=delta.designmatrix,
           gammaslope=gammaslope, gammaslope.constr.V=gammaslope.constr.V,
           gammaslope.constr.c=gammaslope.constr.c, gammaslope.fixed=gammaslope.fixed,
           notA=TRUE, est.variance=FALSE)
summary(mod1)

#*** Model 2: Mixed Rasch model with two latent classes
# theta grid
theta.k1 <- seq( -4, 4, len=7 )
TT <- length(theta.k1)
#-- define theta design matrix
theta.k <- diag(2*TT)   # 2*7=14 classes
#-- delta designmatrix
delta.designmatrix <- matrix( 0, 2*TT, ncol=6 )
# Class 1
delta.designmatrix[1:TT, 1] <- 1
delta.designmatrix[1:TT, 2:3] <- cbind( theta.k1, theta.k1^2 )
# Class 2
delta.designmatrix[TT+1:TT, 4] <- 1
delta.designmatrix[TT+1:TT, 5:6] <- cbind( theta.k1, theta.k1^2 )

#-- define loading matrix E
E <- array( 0, dim=c(I,2,2*TT,2*I + 1) )  # last parameter is constant 1
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- c("Cat0","Cat1")
dimnames(E)[[3]] <- c( paste0("Class1_theta", 1:TT), paste0("Class2_theta", 1:TT) )
dimnames(E)[[4]] <- c( paste0("b_Class1_", colnames(dat)),
       paste0("b_Class2_", colnames(dat)), "One")
for (ii in 1:I){
  # Class 1 item parameters
    E[ ii, 2, 1:TT, ii ] <- -1   # '-b' in '1*theta - b'
    E[ ii, 2, 1:TT, 2*I+1] <- theta.k1  # '1*theta' in '1*theta - b'
  # Class 2 item parameters
    E[ ii, 2, TT + 1:TT, I + ii ] <- -1
    E[ ii, 2, TT + 1:TT, 2*I+1] <- theta.k1
                }
# initial gammaslope parameters
par1 <- qlogis( colMeans( dat ) )
gammaslope <- c( par1, par1 + stats::runif(I, -2,2 ), 1 )
# sum constraint of zero on item difficulties within a class
gammaslope.center.index <- c( rep( 1, I ), rep(2,I), 0 )
gammaslope.center.value <- c(0,0)

# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, skillspace="discrete",
            theta.k=theta.k, delta.designmatrix=delta.designmatrix,
            gammaslope=gammaslope, gammaslope.center.index=gammaslope.center.index,
            gammaslope.center.value=gammaslope.center.value, gammaslope.fixed=gammaslope.fixed,
            notA=TRUE)
summary(mod1)
# latent class proportions
stats::aggregate( mod1$pi.k, list( rep(1:2, each=TT)), sum )
# compare simulated and estimated item parameters
cbind( b1, b2, - mod1$gammaslope[1:I], - mod1$gammaslope[I + 1:I ] )

#--- specification in tamaan
tammodel <- "
ANALYSIS:
  TYPE=MIXTURE;
  NCLASSES(2)
  NSTARTS(5,30)
LAVAAN MODEL:
  F=~ I0001__I0015
ITEM TYPE:
  ALL(Rasch);
    "
mod1b <- TAM::tamaan( tammodel, resp=dat )
summary(mod1b)

#############################################################################
# EXAMPLE 8: 2PL mixture distribution model
#############################################################################

#*** simulate data
set.seed(9187)
library(sirt)
# simulate two latent classes of Rasch populations
I <- 20
b1 <- seq( -1.5, 1.5, len=I)      # difficulties latent class 1
b2 <- b1    # difficulties latent class 2
b2[ c(4,7, 9, 11, 12, 13, 16, 18) ] <- c(1, -.5, -.5, .33, .33, -.66, -1, .3)
# b2 <- scale( b2, scale=FALSE)
b2 <- b2 - mean(b2)
N <- 4000       # number of persons
wgt <- .75       # class probability for class 1
# item slopes
a1 <- rep( 1, I )  # first class
a2 <- rep( c(.5,1.5), I/2 )

# class 1
dat1 <- sirt::sim.raschtype( stats::rnorm( wgt*N ), - b1, fixed.a=a1)
# class 2
dat2 <- sirt::sim.raschtype( stats::rnorm( (1-wgt)*N, mean=1, sd=1.4), - b2, fixed.a=a2)
dat <- rbind( dat1, dat2 )

#*** Model 1: Mixed 2PL model with two latent classes

theta.k1 <- seq( -4, 4, len=7 )
TT <- length(theta.k1)
#-- define theta design matrix
theta.k <- diag(2*TT)   # 2*7=14 classes
#-- delta designmatrix
delta.designmatrix <- matrix( 0, 2*TT, ncol=6 )
# Class 1
delta.designmatrix[1:TT, 1] <- 1
delta.designmatrix[1:TT, 2:3] <- cbind( theta.k1, theta.k1^2 )
# Class 2
delta.designmatrix[TT+1:TT, 4] <- 1
delta.designmatrix[TT+1:TT, 5:6] <- cbind( theta.k1, theta.k1^2 )

#-- define loading matrix E
E <- array( 0, dim=c(I,2,2*TT,4*I ) )
dimnames(E)[[1]] <- colnames(dat)
dimnames(E)[[2]] <- c("Cat0","Cat1")
dimnames(E)[[3]] <- c( paste0("Class1_theta", 1:TT), paste0("Class2_theta", 1:TT) )
dimnames(E)[[4]] <- c( paste0("b_Class1_", colnames(dat)),
                       paste0("a_Class1_", colnames(dat)),
                       paste0("b_Class2_", colnames(dat)),
                       paste0("a_Class2_", colnames(dat)) )

for (ii in 1:I){
  # Class 1 item parameters
    E[ ii, 2, 1:TT, ii ] <- -1           # '-b' in 'a*theta - b'
    E[ ii, 2, 1:TT, I + ii] <- theta.k1  # 'a*theta' in 'a*theta - b'
  # Class 2 item parameters
    E[ ii, 2, TT + 1:TT, 2*I + ii ] <- -1
    E[ ii, 2, TT + 1:TT, 3*I + ii ] <- theta.k1
}

# initial gammaslope parameters
par1 <- scale( - stats::qlogis( colMeans( dat ) ), scale=FALSE )
gammaslope <- c( par1, rep(1,I),  scale( par1 + runif(I, - 1.4, 1.4 ),
        scale=FALSE), stats::runif( I,.6,1.4) )

# constraint matrix
gammaslope.constr.V <- matrix( 0, 4*I, 4 )
# sum of item intercepts equals zero
gammaslope.constr.V[ 1:I, 1] <- 1        # Class 1 (b)
gammaslope.constr.V[ 2*I + 1:I, 2] <- 1  # Class 2 (b)
# sum of item slopes equals number of items -> mean slope of 1
gammaslope.constr.V[ I + 1:I, 3] <- 1    # Class 1 (a)
gammaslope.constr.V[ 3*I + 1:I, 4] <- 1  # Class 2 (a)
gammaslope.constr.c <- c(0,0,I,I)

# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, control=list(maxiter=80), skillspace="discrete",
      theta.k=theta.k, delta.designmatrix=delta.designmatrix,
      gammaslope=gammaslope, gammaslope.constr.V=gammaslope.constr.V,
      gammaslope.constr.c=gammaslope.constr.c,  gammaslope.fixed=gammaslope.fixed,
      notA=TRUE)

# estimated item parameters
mod1$gammaslope
# summary
summary(mod1)
# latent class proportions
round( stats::aggregate( mod1$pi.k, list( rep(1:2, each=TT)), sum ), 3 )
# compare simulated and estimated item intercepts
int <- cbind( b1*a1, b2 * a2, - mod1$gammaslope[1:I], - mod1$gammaslope[2*I + 1:I ] )
round( int, 3 )
# simulated and estimated item slopes
slo <- cbind( a1, a2,  mod1$gammaslope[I+1:I], mod1$gammaslope[3*I + 1:I ] )
round(slo,3)

#--- specification in tamaan
tammodel <- "
ANALYSIS:
  TYPE=MIXTURE;
  NCLASSES(2)
  NSTARTS(10,25)
LAVAAN MODEL:
  F=~ I0001__I0020
    "
mod1t <- TAM::tamaan( tammodel, resp=dat )
summary(mod1t)

#############################################################################
# EXAMPLE 9: Toy example: Exact representation of an item by a factor
#############################################################################

data(data.gpcm)
dat <- data.gpcm[,1,drop=FALSE ]   # choose first item
# some descriptives
( t1 <- table(dat) )

# The idea is that we define an IRT model with one latent variable
# which extactly corresponds to the manifest item.

I <- 1    # 1 item
K <- 4    # 4 categories
TP <- 4   # 4 discrete theta points

# define skill space
theta.k <- diag(TP)
# define loading matrix E
E <- array( -99, dim=c(I,K,TP,1 ) )
for (vv in 1:K){
    E[ 1, vv, vv, 1 ] <- 9
                }
# estimate model
mod1 <- TAM::tam.mml.3pl(resp=dat, E=E, skillspace="discrete",
         theta.k=theta.k,  notA=TRUE)
summary(mod1)
# -> the latent distribution corresponds to the manifest distribution, because ...
round( mod1$pi.k, 3 )
round( t1 / sum(t1), 3 )

#############################################################################
# EXAMPLE 10: Some fixed item loadings
#############################################################################

data(data.Students,package="CDM")
dat <- data.Students
# select variables
vars <- scan( nlines=1, what="character")
    act1 act2 act3 act4 act5 sc1 sc2 sc3 sc4
dat <- data.Students[, vars  ]

# define loading matrix: two-dimensional model
Q <- matrix( 0, nrow=9, ncol=2 )
Q[1:5,1] <- 1
Q[6:9,2] <- 1
# define some fixed item loadings
Q.fixed <- NA*Q
Q.fixed[ c(1,4), 1] <- .5
Q.fixed[ 6:7, 2 ] <- 1

# estimate model
mod3 <- TAM::tam.mml.3pl( resp=dat, gammaslope.des="2PL", Q=Q, Q.fixed=Q.fixed,
            control=list( maxiter=10, nodes=seq(-4,4,len=10) )  )
summary(mod3)

#############################################################################
# EXAMPLE 11: Mixed response formats - Multiple choice and partial credit items
#############################################################################

data(data.timssAusTwn.scored)
dat <- data.timssAusTwn.scored

# select columns with item responses
dat <- dat[, grep("M0", colnames(dat) ) ]
I <- ncol(dat)   # number of items

# The idea is to start with partial credit modelling
# and then to include the guessing parameters

#*** Model 0: Partial Credit Model
mod0 <- TAM::tam.mml(dat)
summary(mod0)

#*** Model 1 and Model 2: include guessing parameters

# multiple choice items
guess_items <- which( apply( dat, 2, max, na.rm=TRUE )==1 )
# define guessing parameters
guess0 <- rep(0,I)
guess0[ guess_items ] <- .25  # set guessing probability to .25

# define which guessing parameters should be estimated
est.guess1 <- rep(0,I)  # all parameters are fixed
est.guess2 <- 1 * ( guess0==.25 )  # joint guessing parameter

# use design matrix from partial credit model
A0 <- mod0$A

#--- Model 1: fixed guessing parameters of .25 and item slopes of 1
mod1 <- TAM::tam.mml.3pl( dat, guess=guess0, est.guess=est.guess1,
            A=A0, est.some.slopes=FALSE, control=list(maxiter=50) )
summary(mod1)

#--- Model 2: estimate joint guessing parameters and item slopes of 1
mod2 <- TAM::tam.mml.3pl( dat, guess=guess0, est.guess=est.guess2,
            A=A0, est.some.slopes=FALSE, control=list(maxiter=50) )
summary(mod2)

# model comparison
IRT.compareModels(mod0,mod1,mod2)

## End(Not run)

TAM documentation built on June 25, 2021, 5:13 p.m.