knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

This extends our data generation to incorporate two new aspects:

  1. Longitudinal data - two timepoints

  2. Latent regression - two observed groups (e.g., Placebo arm and Treatment arm) are modeled.

The other simulation data conditions we have already seen: we will have two latent classes (K=1), and we will use all 8 item types.

The stratified transition matrix is a transition matrix estimated separately for different groups. This is essentially for type of latent regression. Here we regress the latent classes at timepoint 2 onto the observed Group, which allows us to estimate a stratified transition matrix - this is the output of interest when, for example, we look at the differences in treatment arms (observed groups) in the proportion of subjects moving from the "Sick" latent class to the "Not Sick" latent class.

Specify Details of Data Generation

First let's specify the basics:

library(CLCM)
N <- 2000
number.timepoints <- 2
item.type <- c('Ordinal', 'Nominal', 'Poisson', 'Neg_Binom', 'ZINB', 'ZIP', 'Normal', 'Beta') 
sim.categories.j <- c(4, 4, 30, 30, 30, 30, NA, NA) 
J <- length(item.type)
item.names <- paste0('Item_', 1:J)
Q <- matrix(1, nrow = length(item.type), ncol = 1, dimnames = list(paste0('Item_', 1:length(item.type)), NULL))
K <- ncol(Q) 

Now that those basics have been handled, let's focus on generating posterior distributions with differences across observed groups. This will be recovered later via latent regression.

Subject Posterior Distributions via Latent Regression

number.groups <- 2
dat <- data.frame(
  'USUBJID' = rep(paste0('Subject_', formatC(1:N, width = 4, flag = '0')), length.out= N*number.timepoints),
  'Group' = rep(paste0('Group_', 1:number.groups), length.out = N*number.timepoints),
  'Time' = rep(paste0('Time_', 1:number.timepoints), each = N),
  stringsAsFactors=F)

# Design Matrix
X <- model.matrix( ~ Group*Time, data = dat) 
# Beta
Beta <- matrix(0, nrow = ncol(X), dimnames=list(colnames(X), 'param'))

# Small separation between txa:
Beta[grepl('Group_2:Time', rownames(Beta)), ] <- 1
Beta

# Matrix multiply:
XB <- X %*% Beta
p <- exp(XB)/(1 + exp(XB))
lca <- matrix(runif(n = nrow(p)), nrow = nrow(p), ncol = 1) < p
lca <- lca + 1

post <- matrix(0, nrow = nrow(dat), ncol = 2^K)
post[ cbind(1:nrow(lca), lca) ] <- 1 # True posterior distributions - pass to simulation function

Simulate Item Responses

Now we have generated the posterior distributions. Let's pass those posterior distributions to the simulation function to simulate the item responses.

set.seed(03082021)
sim.dat <- simulate_clcm(N = N, number.timepoints = number.timepoints, 
                         Q = Q, 
                         item.names = item.names, 
                         item.type = item.type, 
                         categories.j = sim.categories.j, 
                         post = post)

Now we have the item responses generated. We also have the Q-matrix, the item type, and the item names. We are almost ready to estimate the model. However, first, we want to merge in the observed group Group variable. Merge the dataframe with the Observed Group membership with the simulated dataframe. You need the Group variable with the item responses to pass to the model estimation function. If you don't, the esimtation routine won't have a Group variable to compute the latent regression with.

dat.cov <- merge(x = dat, y = sim.dat$dat, by = c('USUBJID', 'Time'))

Estimate CLCM

We need to specify our latent regression model. Here we specify a regression of the latent classes on the Group variable at timepoint 2 only. In other words, we specify that we want to model the transition matrix stratified on the Group variable.

mod <- clcm(dat = dat.cov, 
            item.type = item.type, 
            item.names = item.names, 
            Q = Q, max.diff = 0.001, 
            lat.reg =  list('Time_1' = NULL, 
                            'Time_2' = 'Group'))

Transition Matrix

After estimating the model, check the latent regression parameters and the transition matrix:

Beta # Generating parameter
mod$lat.reg.param  # estimate
transition_matrix_clcm(mod = mod, stratification = T, covariate = 'Group')

Classification Accuracy

Finally, check the classification accuracy by comparing the true classifications with the estimates.

lca.hat <- mod$dat$lca 
lca.true <- mod$dat$true_lca      
table(lca.true == lca.hat)
xtabs( ~ lca.true + lca.hat)


CJangelo/CLCM documentation built on May 22, 2022, 9:27 a.m.