knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
First step is to set-up the data we will be generating. Here 1,000 subjects from 4 different latent classes responded to 16 items. Data was only collected once.
library(CLCM) library(nnet) N <- 1000 number.timepoints <- 1 item.type <- rep( c('Ordinal', 'Nominal', 'Poisson', 'Neg_Binom', 'ZINB', 'ZIP', 'Normal', 'Beta'), 2) sim.categories.j <- rep( c(4, 4, 30, 30, 30, 30, NA, NA) , 2) J <- length(item.type) item.names <- paste0('Item_', 1:J) K= 2 Q <- matrix(c(rep(c(1,0), J/2), rep(c(0, 1), J/2)), nrow = J, ncol = K, byrow = T)
Generate the subject posteriors via (multinomial) latent regression. The latent class assignments will differ across the two observed groups (Group
).
number.groups <- 2 dat <- data.frame( 'USUBJID' = rep(paste0('Subject_', formatC(1:N, width = 4, flag = '0')), length.out= N*number.timepoints), 'Group' = rep(paste0('Group_', 1:number.groups), length.out = N*number.timepoints), 'Time' = rep(paste0('Time_', 1:number.timepoints), each = N), stringsAsFactors=F) # Design Matrix XX <- model.matrix( ~ Group, data = dat) # Beta Beta <- matrix(0, nrow = ncol(XX), ncol = 2^K - 1, dimnames=list(colnames(XX), rep('LC', 2^K -1))) Beta[1, ] <- c(0.2, 0.8, 0.4) Beta[2, ] <- c(0.2, -1.0, -1.6) Beta # Matrix multiply: XB <- XX %*% Beta p <- exp(XB)/(1 + apply(exp(XB), 1, sum)) p <- cbind(1 - rowSums(p), p) lca <- vector() for(i in 1:nrow(p)){ lca <- c(lca, sample(x = c(1:(2^K)), size = 1, prob = p[i, ]) ) } #end loop to sample lca
At this point we have used the observed group membership (Group
) to generate the (true) posterior distributions/latent class assignments for both timepoints. Let's check the latent class assignments vs Observed group assignment. Again, we haven't yet moved into latent class models yet.
# Check the LCA prop.table(table(lca)) # Check LCA across observed Groups: prop.table(xtabs( ~ lca + dat$Group), margin = 1)
Before we move on with the simulation, let's just do a quick check on whether a simple multinomial logistic regression can recover this. Regress latent class assignment onto observed groups:
mod <- nnet:::multinom(as.factor(lca) ~ dat$Group) coef(mod) t(Beta)
Recovery seems fine. Let's create the true posterior distributions and pass to the simulation function.
post <- matrix(0, nrow = nrow(dat), ncol = 2^K) post[ cbind(1:nrow(dat), lca) ] <- 1 set.seed(03062021) X <- simulate_clcm(N = N, number.timepoints = number.timepoints, Q = Q, item.names = item.names, item.type = item.type, categories.j = sim.categories.j, post = post)
Merge the dataframe with the Observed Group membership with the simulated dataframe. You need the Group
variable with the item responses to pass to the model estimation function. If you don't, the esimtation routine won't have a Group
variable to compute the multinomial latent regression with.
dat.cov <- merge(x = dat, y = X$dat, by = c('USUBJID', 'Time'))
Specify the latent regression formula - regress latent classes onto Group
variable in model estimation by passing lat.reg = list('Time_1' = 'Group')
. Note: if we had additional covariates, we would pass the corresponding regression formula, e.g. 'Group + Sex + Age'
. This will be passed to as.formula
and then passed to a function that fits a regression.
mod <- clcm(dat = dat.cov, item.type = item.type, item.names = item.names, Q = Q, max.diff = 0.001, lat.reg = list('Time_1' = 'Group') )
Evaluate the Latent Regression Parameters:
Beta # Generating parameter mod$lat.reg.param # estimate
Next, compare true classification (latent class assignment, lca) with estimates:
lca.hat <- mod$dat$lca lca.true <- apply( dat.cov[ , grep('true_post', colnames(dat.cov)) ], 1, which.max) xtabs( ~ lca.hat + lca.true) table(lca.true == lca.hat) prop.table(table(lca.true == lca.hat))
Pretty, preeettty, preeeeeetttttyyy good.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.