reglca: Regularized Latent Class Analysis

reglca R Documentation

Regularized Latent Class Analysis

Description

Estimates a regularized latent class model for dichotomous item responses (Chen, Liu, Xu, & Ying, 2015; Chen, Li, Liu, & Ying, 2017). The SCAD and MCP penalty functions are available.

Usage

reglca(dat, nclasses, weights=NULL, group=NULL, regular_type="scad",
   regular_lam=0, sd_noise_init=1, item_probs_init=NULL, class_probs_init=NULL,
   random_starts=1, random_iter=20, conv=1e-05, h=1e-04, mstep_iter=10,
   maxit=1000, verbose=TRUE, prob_min=.0001)

## S3 method for class 'reglca'
summary(object, digits=4, file=NULL,  ...)

Arguments

dat

Matrix with dichotomous item responses. NAs are allowed.

nclasses

Number of classes

weights

Optional vector of sampling weights

group

Optional vector for grouping variable

regular_type

Regularization type. Can be "scad" or "mcp". See gdina for more information.

regular_lam

Regularization parameter λ

sd_noise_init

Standard deviation for amount of noise in generating random starting values

item_probs_init

Optional matrix of initial item response probabilities

class_probs_init

Optional vector of initial class probabilities

random_starts

Number of random starts

random_iter

Number of initial iterations for random starts

conv

Convergence criterion

h

Numerical differentiation parameter

mstep_iter

Number of iterations in the M-step

maxit

Maximum number of iterations

verbose

Logical indicating whether convergence progress should be displayed

prob_min

Lower bound for probabilities in estimation

object

A required object of class reglca, obtained from a call to the function reglca.

digits

Number of digits after decimal separator to display.

file

Optional name of a file to which the summary output should be sunk (written).

...

Further arguments to be passed.

Details

The regularized latent class model for dichotomous item responses assumes C latent classes. The item response probabilities P(X_i=1|c)=p_{ic} are estimated such that the number of distinct p_{ic} values per item is minimized. This eases interpretation and makes it possible to recover the structure of a true (but unknown) cognitive diagnostic model.
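
The two penalty types named above can be sketched in base R to show how the regularization parameter λ shapes them. This is a minimal illustration using the textbook SCAD and MCP definitions. The function names scad_penalty and mcp_penalty and the tuning constants a=3.7 and a=3 are assumptions made for this sketch and are not taken from the CDM source, and the quantity that reglca actually penalizes is not shown here.

```r
# Hypothetical helpers (not part of CDM): textbook SCAD and MCP penalties.
# 'x' is the quantity being penalized, 'lambda' the regularization parameter,
# 'a' a tuning constant (the values 3.7 and 3 are conventional, assumed here).
scad_penalty <- function(x, lambda, a = 3.7) {
  ax <- abs(x)
  ifelse(ax <= lambda,
         lambda * ax,                                   # linear (lasso-like) part
         ifelse(ax <= a * lambda,
                -(ax^2 - 2 * a * lambda * ax + lambda^2) / (2 * (a - 1)),
                (a + 1) * lambda^2 / 2))                # constant for large |x|
}

mcp_penalty <- function(x, lambda, a = 3) {
  ax <- abs(x)
  ifelse(ax <= a * lambda,
         lambda * ax - ax^2 / (2 * a),                  # concave part
         a * lambda^2 / 2)                              # constant for large |x|
}

# Evaluate both penalties on a grid of values for lambda = 0.08
x_grid <- seq(0, 1, by = 0.05)
penalties <- data.frame(x = x_grid,
                        scad = scad_penalty(x_grid, lambda = 0.08),
                        mcp = mcp_penalty(x_grid, lambda = 0.08))
print(head(penalties))
```

Both penalties grow like λ|x| near zero and level off for large |x|, so small values are shrunk strongly while large ones are left nearly unpenalized.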

Value

A list containing the following elements (selection):

item_probs

Item response probabilities

class_probs

Latent class probabilities

p.aj.xi

Individual posterior

p.xi.aj

Individual likelihood

loglike

Log-likelihood value

Npars

Number of estimated parameters

Nskillpar

Number of skill class parameters

G

Number of groups

n.ik

Expected counts

Nipar

Number of item parameters

n_reg

Number of regularized parameters

n_reg_item

Number of regularized parameters per item

item

Data frame with item parameters

pjk

Item response probabilities (in an array)

N

Number of persons

I

Number of items
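
The elements loglike, Npars and N listed above are enough to compute standard information criteria by hand, for instance when comparing fits obtained with different regular_lam values. The following is a minimal sketch. The helper info_criteria is hypothetical (not part of CDM), toy_fit is a made-up list standing in for a fitted object, and it is an assumption that Npars is the appropriate complexity count under regularization.

```r
# Hypothetical helper (not part of CDM): AIC and BIC from a list that has
# the elements 'loglike', 'Npars' and 'N' described in the Value section.
info_criteria <- function(fit) {
  dev <- -2 * fit$loglike                 # deviance
  c(AIC = dev + 2 * fit$Npars,
    BIC = dev + log(fit$N) * fit$Npars)
}

# Made-up stand-in for a fitted model; with CDM installed one would
# pass an object returned by reglca() instead.
toy_fit <- list(loglike = -100, Npars = 5, N = 100)
ic <- info_criteria(toy_fit)
print(ic)
```

Lower values indicate a better trade-off between fit and complexity under the respective criterion.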

References

Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110, 850-866.

Chen, Y., Li, X., Liu, J., & Ying, Z. (2017). Regularized latent class analysis with application in cognitive diagnosis. Psychometrika, 82, 660-692.

See Also

See also the gdina and slca functions for regularized estimation.

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: Estimating a regularized LCA for DINA data
#############################################################################

#---- simulate data
I <- 12  # number of items
# define Q-matrix
q.matrix <- matrix(0,I,2)
q.matrix[ 1:(I/3), 1 ] <- 1
q.matrix[ I/3 + 1:(I/3), 2 ] <- 1
q.matrix[ 2*I/3 + 1:(I/3), c(1,2) ] <- 1
N <- 1000  # number of persons
guess <- rep(seq(.1,.3,length.out=I/3), 3)
slip <- .1
rho <- 0.3  # skill correlation
set.seed(987)
dat <- CDM::sim.din( N=N, q.matrix=q.matrix, guess=guess, slip=slip,
           mean=0*c( .2, -.2 ), Sigma=matrix( c( 1, rho,rho,1), 2, 2 ) )
dat <- dat$dat

#--- Model 1: Four latent classes without regularization
mod1 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0, random_starts=3,
               random_iter=10, conv=1E-4)
summary(mod1)

#--- Model 2: Four latent classes with regularization and lambda=.08
mod2 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.08, regular_type="scad",
               random_starts=3, random_iter=10, conv=1E-4)
summary(mod2)

#--- Model 3: Four latent classes with regularization and lambda=.05 with warm start

# "warm start" -> use initial parameters from fitted model with higher lambda value
item_probs_init <- mod2$item_probs
class_probs_init <- mod2$class_probs
mod3 <- CDM::reglca(dat=dat, nclasses=4, regular_lam=0.05, regular_type="scad",
               item_probs_init=item_probs_init, class_probs_init=class_probs_init,
               random_starts=3, random_iter=10, conv=1E-4)
summary(mod3)

## End(Not run)
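
To check whether regularization has fused item response probabilities across classes, as described in Details, one can count the distinct values per item. The sketch below uses a made-up matrix rather than a fitted model. The helper n_distinct_per_item is hypothetical (not part of CDM), and it is an assumption that item_probs is laid out with items in rows and classes in columns.

```r
# Hypothetical helper (not part of CDM): number of distinct probabilities
# per item after rounding. Assumes rows are items and columns are classes.
n_distinct_per_item <- function(item_probs, digits = 3) {
  apply(round(item_probs, digits), 1, function(p) length(unique(p)))
}

# Made-up example with four classes: item A follows a DINA-like pattern
# with only two distinct values, item B differs in every class.
toy_probs <- rbind(A = c(.1, .1, .9, .9),
                   B = c(.1, .3, .5, .9))
n_distinct <- n_distinct_per_item(toy_probs)
print(n_distinct)
```

With CDM installed, the same helper could be applied to mod1$item_probs and mod2$item_probs from Example 1 to compare the unregularized and regularized solutions.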

CDM documentation built on Aug. 25, 2022, 5:08 p.m.