```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Introduction {#intro}

Latent class analysis (LCA) is a well-known mixture model for dividing a population into mutually exclusive and exhaustive subgroups based on responses to observed categorical variables. LCA assumes that the population comprises unobserved classes such that observations within a class share similar response patterns, while patterns differ across classes. In LCA, the latent class variable exhaustively explains the association among the categorical variables; thus, the number of latent classes tends to increase as the number of categorical variables increases.

When the data contain several repeatedly measured variables, an ordinary LCA may require a large number of classes to fit adequately, making it challenging to label each latent class. Instead of increasing the number of classes, we can use a stage-sequential dynamic class variable to describe qualitative shifts in latent features. Latent transition analysis (LTA) is a representative stage-sequential latent class model, where the qualitative changes in latent classes are captured by a Markov model; that is, the relationships between latent statuses (i.e., the latent class memberships at each point in time) are described by transition probability matrices \citep{lta}. The transition probabilities are the conditional probabilities of latent status membership given membership at the previous time point.

Instead of estimating transition probabilities between latent statuses at adjacent time points, we can summarize the stage-sequential patterns of latent status by deriving a higher-level latent class variable by means of LCA. Latent class profile analysis (LCPA) is a stage-sequential latent class model in which the associations among latent statuses are not restricted to the first-order Markov structure. In LCPA, the latent status memberships identified at each time point are treated as observed categorical variables, and a standard LCA is fitted to these variables to derive a higher-level latent class variable, referred to as the class profile \citep{lcpa}.

When different attributes of an individual are represented by multiple latent class variables, it may be necessary to investigate the relationship between these latent class variables. Suppose, for instance, there are three latent class variables indicating adolescents' cigarette-smoking, alcohol-drinking, and drug-use behaviors, respectively. A higher-level latent class variable (i.e., a joint class) can then describe the overall substance-use behavior based on how the memberships of these latent class variables fit together. \citet{jlca} used this idea to investigate associations between violent and drug-use behaviors and referred to the model as joint latent class analysis (JLCA). They applied the LCPA estimation procedure, without equality constraints on the latent status construct over time, to obtain the JLCA estimates. In fact, LCPA is analogous to a JLCA in which the latent class variables correspond to different time points.

There are numerous additional hierarchical LCA variants. For example, \citet{lcamlg} proposed an LCA with multiple latent groups to describe the relationship between adolescents' drug use and violent behavior. They used the joint class variable identified from three latent class variables (cigarette smoking, alcohol drinking, and other drug use) as a latent group for an LCA of violent behavior in adolescents. \citet{mlcpa} proposed multivariate latent class profile analysis (MLCPA), in which there are several latent class variables at each time point of an LCPA. \citet{jlcpa} proposed joint latent transition analysis (JLTA) and joint latent class profile analysis (JLCPA) to extend JLCA to longitudinal data. These models identify a joint class variable at each time point and examine its stage-sequential pattern using LTA or LCPA, respectively.

All these LCA variants share a similar hierarchical structure based on the use of transition probability matrices to link different latent variables. In LTA, the transition probability matrices link latent statuses over two subsequent time points; therefore, a two-time-point LTA is comparable to an LCA that includes a latent group variable. In other words, an LCA with a latent group variable can implement an LTA with two time points when the latent group variable is measured by the same categorical variables as the latent class variable, but at a different time point. In LCPA and JLCA, by contrast, the transition probability matrices link a higher-level latent class variable (e.g., the class profile or joint class) to lower-level latent class variables. However, no research has yet combined all of these hierarchical LCA models into one synthesized framework. This paper proposes the categorical latent variable tree (CATLVT), which unifies the aforementioned models in a single framework, together with a maximum-likelihood estimation procedure based on the upward-downward algorithm.

The proposed framework is implemented in the R package \texttt{catlvm}; its use is illustrated with empirical examples in the experimental study section below.

# Existing Models

## Latent class analysis

Suppose that there are $M$ mutually dependent categorical variables. LCA introduces a latent class variable $L_i$ with $K$ classes for individual $i$ and assumes that the $M$ variables are conditionally independent given class membership, so that the probability of observing the response pattern $\mathbf{y}_i = (y_{i1}, \dots, y_{iM})$ is

$$ P(\mathbf{Y} = \mathbf{y}_i) = \sum_{k = 1}^{K} P(L_i = k) \times \prod_{m = 1}^{M} P(Y_m = y_{im} | L_i = k) $$
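For binary items, this sum can be evaluated directly. The following sketch (with hypothetical parameter values, not the \texttt{catlvm} interface) computes $P(\mathbf{Y} = \mathbf{y}_i)$ for a single response pattern:

```{r}
set.seed(1)
K <- 2; M <- 4
pi_k <- c(0.6, 0.4)                      # class prevalences Pr(L = k)
rho  <- matrix(runif(M * K), M, K)       # Pr(Y_m = 1 | L = k) for binary items
y    <- c(1, 0, 1, 1)                    # one observed response pattern

# P(Y = y) = sum_k pi_k * prod_m rho_mk^y_m * (1 - rho_mk)^(1 - y_m)
sum(pi_k * apply(rho^y * (1 - rho)^(1 - y), 2, prod))
```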

## Latent transition analysis

Suppose that there is a dynamic latent variable measured at $T$ time points, where $S_t$ denotes the latent status at time $t$. At each time point, the latent status is measured by $M$ manifest items. Assuming a first-order Markov structure, the probability of the observed response pattern is

$$ P(\mathbf{Y} = \mathbf{y}_i) = \sum_{\mathbf{s}} P(S_1 = s_1) \times \prod_{t = 2}^{T} P(S_t = s_t | S_{t-1} = s_{t - 1}) \times \prod_{t = 1}^{T} \prod_{m = 1}^{M} P(Y_{tm} = y_{tm} | S_t = s_t) $$
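Because the sum over all status sequences $\mathbf{s}$ grows exponentially in $T$, it is typically evaluated by a forward recursion over the Markov chain. A minimal sketch with hypothetical values and binary items:

```{r}
set.seed(2)
n_class <- 2; n_time <- 3; n_item <- 3
delta <- c(0.5, 0.5)                               # initial prevalences Pr(S_1 = s)
tau   <- matrix(c(0.8, 0.2,
                  0.3, 0.7), n_class, n_class,
                byrow = TRUE)                      # Pr(S_t = s' | S_{t-1} = s)
rho   <- array(runif(n_time * n_item * n_class),
               c(n_time, n_item, n_class))         # Pr(Y_tm = 1 | S_t = s)
y     <- matrix(rbinom(n_time * n_item, 1, 0.5), n_time, n_item)

# measurement probability Pr(y_t | S_t = s): rows = time points, cols = statuses
meas <- sapply(1:n_class, function(s)
  sapply(1:n_time, function(t)
    prod(rho[t, , s]^y[t, ] * (1 - rho[t, , s])^(1 - y[t, ]))))

# forward recursion: a_t(s) = Pr(y_1, ..., y_t, S_t = s)
a <- delta * meas[1, ]
for (t in 2:n_time) a <- drop(a %*% tau) * meas[t, ]
sum(a)  # P(Y = y)
```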

## Joint latent class analysis

Several latent variables can represent different latent features; for example, there may be latent class variables for substance-use behaviors such as alcohol, tobacco, and marijuana use. It is then of interest to identify a higher-level latent variable binding those latent variables to describe overall substance-use behavior. With a joint class variable $U$ and $J$ latent class variables $L_1, \dots, L_J$, each measured by $M_j$ items, the probability of the observed responses is

$$ P(\mathbf{Y} = \mathbf{y}) = \sum_{s} \sum_{\mathbf{c}} P(U = s) \times \prod_{j = 1}^{J} \left\{ P(L_j = c_j | U = s) \times \prod_{m_j = 1}^{M_j} P(Y_{m_j} = y_{m_j} | L_j = c_j) \right\} $$
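As a toy numeric illustration (all parameter values are made up; $J = 2$ class variables, a two-class joint variable $U$, binary items), the following sketch evaluates $P(\mathbf{Y} = \mathbf{y})$ by marginalizing each $L_j$ within a joint class and then summing over $U$:

```{r}
set.seed(3)
Ku <- 2; Kc <- c(2, 2); M <- c(3, 3)               # classes and items per variable
pU <- c(0.5, 0.5)                                  # Pr(U = s)
pL <- list(matrix(c(0.8, 0.2, 0.3, 0.7), 2, 2),    # Pr(L_j = c | U = s), cols = s
           matrix(c(0.9, 0.1, 0.2, 0.8), 2, 2))
rho <- lapply(1:2, function(j) matrix(runif(M[j] * Kc[j]), M[j], Kc[j]))
y   <- list(c(1, 0, 1), c(0, 0, 1))                # one response pattern per variable

# Pr(y_j | L_j = c) for each class c of each latent variable
meas <- lapply(1:2, function(j)
  apply(rho[[j]]^y[[j]] * (1 - rho[[j]])^(1 - y[[j]]), 2, prod))

# the inner sums over c_j factorize across j within each joint class s
sum(sapply(1:Ku, function(s)
  pU[s] * prod(sapply(1:2, function(j) sum(pL[[j]][, s] * meas[[j]])))))
```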

# Categorical Latent Variable Tree (CATLVT) {#body}

## Generalization of latent class dependency

Let there be $J$ categorical latent variables $S_j$, where $j = 1, \dots, J$, some of which are directly measured by observed variables. Denote by $E$ the set of indices of the latent variables measured by observed items:

$$ \mathbf{S} = [S_1, S_2, \dots, S_J] \quad\text{and}\quad E = \{\, j \mid S_j \text{ is measured by observed variables} \,\} $$

## Model structure

Assume that the latent variables form a tree rooted at $S_1$, and let $\rho(j)$ denote the parent node of $S_j$. The joint probability of the observed responses then factorizes along the tree:

$$ P(\mathbf{Y} = \mathbf{y}) = \sum_{\mathbf{s}} P(S_{1} = s_1) \times \prod_{j \ne 1} P(S_{j} = s_j | S_{\rho(j)} = s_{\rho(j)}) \times \prod_{j \in E} P(Y_{j} = y_{j} | S_{j} = s_{j}) $$

The model parameters are the root prevalences, the conditional (transition) probabilities, and the item response probabilities:

$$ \pi_k = \textsf{Pr}\left( S_1 = k \right), \quad \tau_{k|l}^{(j)} = \textsf{Pr}\left( S_{j} = k \mid S_{\rho(j)} = l \right), \quad \text{and} \quad \rho_{mr|k} = \textsf{Pr}\left( Y_{m} = r | S = k \right) $$

$$ P(\mathbf{Y} = \mathbf{y}) = \sum_{\mathbf{s}} \pi_{s_1} \times \prod_{j \ne 1} \tau_{s_{j}|s_{\rho(j)}}^{(j)} \times \prod_{j} \xi_{j}(s_{j}) $$ $$ \text{where} \quad \xi_{j}(k) = \begin{cases} \prod_{m_{j} = 1}^{M_{j}} \prod_{r = 1}^{R_{m_{j}}} \rho_{m_{j}r|k}^{I(y_{m_{j}} = r)} & j \in E \\ 1 & j \notin E \end{cases} $$
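For a small tree, this factorization can be checked by brute-force enumeration of all latent configurations. The following sketch uses a hypothetical four-node tree (a root with two children, one of which has a further child) with made-up values and binary items for simplicity; the upward-downward algorithm in the estimation section replaces this exponential sum.

```{r}
set.seed(4)
parent <- c(0, 1, 1, 2)        # rho(j): node 1 is the root, node 4 hangs under node 2
K      <- c(2, 2, 2, 2)        # number of classes per latent node
E      <- c(3, 4)              # nodes measured by M binary items each
M      <- 2
pi1 <- c(0.5, 0.5)             # root prevalences pi_k
tau <- lapply(2:4, function(j) {
  m <- matrix(runif(K[j] * K[parent[j]]), K[j], K[parent[j]])
  sweep(m, 2, colSums(m), "/")                     # columns sum to one
})
rho <- setNames(lapply(E, function(j) matrix(runif(M * K[j]), M, K[j])), E)
y   <- setNames(list(c(1, 0), c(0, 1)), E)         # one response pattern

xi <- function(j, k) {                             # measurement term xi_j(k)
  if (!(j %in% E)) return(1)
  jj <- as.character(j)
  prod(rho[[jj]][, k]^y[[jj]] * (1 - rho[[jj]])[, k]^(1 - y[[jj]]))
}

s_grid <- expand.grid(lapply(K, seq_len))          # all latent configurations
sum(apply(s_grid, 1, function(s) {
  p <- pi1[s[1]]
  for (j in 2:4) p <- p * tau[[j - 1]][s[j], s[parent[j]]]
  for (j in 1:4) p <- p * xi(j, s[j])
  p
}))
```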

## Introducing covariates

Covariates can be incorporated into the model, for instance by letting the prevalences and conditional probabilities depend on individual characteristics through multinomial logistic regressions; the estimation of covariate effects is discussed in the three-step approach below.

# Parameter Estimation

## EM algorithm

As with typical latent variable models, the parameters of CATLVT can be estimated by maximum likelihood using the expectation-maximization (EM) algorithm, which alternates between the two steps below until the log-likelihood converges.

### Expectation-step (E-step)

Given the current parameter estimates, compute the posterior probabilities of the latent class memberships for each observation. For CATLVT, the required marginal and parent-child joint posteriors are obtained by the upward-downward algorithm described in the next subsection.

### Maximization-step (M-step)

Update the prevalences, conditional probabilities, and item response probabilities using the posterior probabilities computed in the E-step.
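As a concrete illustration of the two steps, below is a minimal EM loop for the plain-LCA special case with binary items and simulated data. This is a sketch with made-up names and values, not the \texttt{catlvm} interface; in CATLVT the E-step posteriors would instead come from the upward-downward algorithm.

```{r}
set.seed(5)
n <- 500; M <- 4; K <- 2
true_rho <- cbind(rep(0.9, M), rep(0.2, M))        # true item probs per class
cls <- sample(1:K, n, replace = TRUE, prob = c(0.6, 0.4))
Y   <- matrix(rbinom(n * M, 1, t(true_rho)[cls, ]), n, M)

pi_k <- rep(1 / K, K)                              # starting values
rho  <- matrix(runif(M * K, 0.3, 0.7), M, K)
for (iter in 1:200) {
  # E-step: theta_ik proportional to pi_k * prod_m rho_mk^y * (1 - rho_mk)^(1 - y)
  logm <- Y %*% log(rho) + (1 - Y) %*% log(1 - rho)        # n x K
  post <- exp(sweep(logm, 2, log(pi_k), "+"))
  post <- post / rowSums(post)
  # M-step: closed-form updates as ratios of expected counts
  pi_k <- colMeans(post)
  rho  <- t(t(post) %*% Y / colSums(post))                 # back to M x K
}
round(pi_k, 2); round(rho, 2)
```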

## Upward-downward algorithm

The fully joint posterior distribution over all latent variables has a dimension that grows exponentially with the number of latent nodes; however, the M-step requires only its marginal versions. To obtain the marginal posterior probabilities without computing the full joint posterior, we adopt the upward-downward algorithm.

First, the upward probabilities are the conditional probabilities of the data below a latent node given the latent class membership of that node. Here $\bar{\mathbf{Y}}_{j}$ denotes the observed variables in the subtree rooted at $S_j$, and $\mathbf{c}(j)$ denotes the set of children of node $j$.

$$ \begin{aligned}
\lambda_{j}(k) &= \textsf{Pr}\left( \bar{\mathbf{Y}}_{j} \middle| S_{j} = k \right) = \xi_{j}(k) \\
\lambda_{j, \rho(j)}(l) &= \textsf{Pr}\left( \bar{\mathbf{Y}}_{j} \middle| S_{\rho(j)} = l \right) \\
&= \sum_k \textsf{Pr}\left( \bar{\mathbf{Y}}_{j}, S_{j} = k \middle| S_{\rho(j)} = l \right) \\
&= \sum_k \textsf{Pr}\left( S_{j} = k \middle| S_{\rho(j)} = l \right) \textsf{Pr}\left( \bar{\mathbf{Y}}_{j} \middle| S_{j} = k \right) \\
&= \sum_k \tau_{k | l}^{(j)} \lambda_{j}(k) \\
\lambda_{\rho(j)}(k) &= \textsf{Pr}\left( \bar{\mathbf{Y}}_{\rho(j)} \middle| S_{\rho(j)} = k \right) \\
&= \textsf{Pr}\left( \bar{\mathbf{Y}}_{\mathbf{c}(\rho(j))} \middle| S_{\rho(j)} = k \right) \textsf{Pr}\left( \mathbf{Y}_{\rho(j)} \middle| S_{\rho(j)} = k \right) \\
&= \left[ \prod_{i \in \mathbf{c}(\rho(j))} \textsf{Pr}\left( \bar{\mathbf{Y}}_{i} \middle| S_{\rho(j)} = k \right) \right] \textsf{Pr}\left( \mathbf{Y}_{\rho(j)} \middle| S_{\rho(j)} = k \right) \\
&= \left[ \prod_{i \in \mathbf{c}(\rho(j))} \lambda_{i, \rho(j)}(k) \right] \xi_{\rho(j)}(k)
\end{aligned} $$
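The following self-contained sketch sets up a toy three-node tree (root $S_1$ with measured children $S_2$ and $S_3$; all names and values are hypothetical) and runs the upward pass; the sketches after the next formulas reuse these objects:

```{r}
set.seed(6)
K   <- 2                                  # two classes per latent node
pi1 <- c(0.4, 0.6)                        # root prevalences pi_k
tau <- list(`2` = matrix(c(0.7, 0.3, 0.2, 0.8), K, K),  # tau^{(j)}_{k|l}, cols = l
            `3` = matrix(c(0.6, 0.4, 0.1, 0.9), K, K))
M   <- 3                                  # binary items per measured node
rho <- list(`2` = matrix(runif(M * K), M, K),           # Pr(Y_m = 1 | S_j = k)
            `3` = matrix(runif(M * K), M, K))
y   <- list(`2` = c(1, 1, 0), `3` = c(0, 1, 0))         # observed responses

# measurement terms xi_j(k); the root is not measured, so xi_1(k) = 1
xi <- lapply(setNames(c("2", "3"), c("2", "3")), function(j)
  apply(rho[[j]]^y[[j]] * (1 - rho[[j]])^(1 - y[[j]]), 2, prod))

# upward pass: lambda_j = xi_j at the leaves, then
# lambda_{j,rho(j)}(l) = sum_k tau_{k|l} lambda_j(k), combined at the root
lam    <- xi                                            # lambda_2, lambda_3
lam_up <- lapply(setNames(c("2", "3"), c("2", "3")), function(j)
  drop(crossprod(tau[[j]], lam[[j]])))                  # lambda_{j, rho(j)}
lam[["1"]] <- lam_up[["2"]] * lam_up[["3"]]             # lambda_1, with xi_1 = 1
```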

The downward probabilities are the joint probabilities of all data except that below the node and the latent class membership of that node, where $\bar{\mathbf{Y}}_{1 \backslash j}$ denotes the observed variables outside the subtree rooted at $S_j$.

$$ \begin{aligned}
\alpha_{j}(k) &= \textsf{Pr}\left( \bar{\mathbf{Y}}_{1 \backslash j} = \bar{\mathbf{y}}_{1 \backslash j}, S_j = k \right) \\
&= \sum_{l} \textsf{Pr}\left( \bar{\mathbf{Y}}_{1 \backslash j} = \bar{\mathbf{y}}_{1 \backslash j}, S_j = k, S_{\rho(j)} = l \right) \\
&= \sum_{l}\frac{\tau_{k|l}^{(j)} \alpha_{\rho(j)}(l) \lambda_{\rho(j)}(l)}{\lambda_{j, \rho(j)}(l)}
\end{aligned} $$
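Continuing the toy sketch above (reusing `pi1`, `tau`, `lam`, and `lam_up`), the downward pass is a single sweep from the root:

```{r}
alp <- list(`1` = pi1)                     # alpha_1(k) = pi_k at the root
for (j in c("2", "3")) {
  # parent message excluding subtree j: alpha_1(l) * lambda_1(l) / lambda_{j,1}(l)
  msg <- alp[["1"]] * lam[["1"]] / lam_up[[j]]
  alp[[j]] <- drop(tau[[j]] %*% msg)       # alpha_j(k) = sum_l tau_{k|l} msg(l)
}
```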

After the upward and downward recursions are completed, each latent node carries both probabilities. Because the product of the upward and downward probabilities at a node is the joint probability of the whole data and the latent class membership of that node, the likelihood can be computed at any latent node by summing this product over the class memberships. Since the downward probability of the root node is trivially the root prevalence, it is convenient to compute the likelihood at the root.

$$ \begin{aligned}
f(\mathbf{y}) &= \textsf{Pr}\left( \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right) \\
&= \sum_k \textsf{Pr}\left( \bar{\mathbf{Y}} = \bar{\mathbf{y}}, S_j = k \right) \\
&= \sum_k \textsf{Pr}\left( \bar{\mathbf{Y}}_{j} = \bar{\mathbf{y}}_{j} \middle| S_j = k \right) \textsf{Pr}\left( \bar{\mathbf{Y}}_{1 \backslash j} = \bar{\mathbf{y}}_{1 \backslash j}, S_j = k \right) \\
&= \sum_k \alpha_{j}(k) \lambda_{j}(k), \quad \forall\ j \in \{1, \dots, J\}.
\end{aligned} $$
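In the toy sketch, this invariance of the likelihood across nodes can be verified directly:

```{r}
# f(y) = sum_k alpha_j(k) * lambda_j(k) yields the same value at every node
sapply(c("1", "2", "3"), function(j) sum(alp[[j]] * lam[[j]]))
```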

The posterior probabilities of the latent class memberships at each node are then simply the products of the upward and downward probabilities divided by the likelihood.

$$ \begin{aligned}
\theta_{j}(k) &= \textsf{Pr}\left( S_j = k \middle| \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right) \\
&= \frac{ \textsf{Pr}\left( S_j = k, \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right) }{ \textsf{Pr}\left( \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right) } \\
&= \frac{ \textsf{Pr}\left( S_j = k, \bar{\mathbf{Y}}_{1\backslash j} = \bar{\mathbf{y}}_{1\backslash j} \right) \textsf{Pr}\left( \bar{\mathbf{Y}}_{j} = \bar{\mathbf{y}}_{j} \middle| S_j = k \right) }{ \textsf{Pr}\left( \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right) } \\
&= \frac{ \alpha_{j}(k) \lambda_{j}(k) }{ f(\mathbf{y}) }
\end{aligned} $$

Finally, the joint posterior probabilities of each parent-child pair are required to estimate the transition probabilities.

$$ \begin{aligned}
\theta_{j, \rho(j)}(k,l) &= \textsf{Pr}\left( S_j = k, S_{\rho(j)} = l \middle| \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right) \\
&= \frac{\textsf{Pr}\left( S_j = k, S_{\rho(j)} = l, \bar{\mathbf{Y}} = \bar{\mathbf{y}} \right)}{\textsf{Pr}\left(\bar{\mathbf{Y}} = \bar{\mathbf{y}} \right)} \\
&= \frac{\lambda_{j}(k) \, \tau_{k|l}^{(j)} \, \alpha_{\rho(j)}(l) \, \lambda_{\rho(j)}(l)}{\lambda_{j, \rho(j)}(l) \, f(\mathbf{y})}
\end{aligned} $$
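Both posterior quantities follow from the stored upward and downward probabilities. Continuing the same toy sketch:

```{r}
f <- sum(alp[["1"]] * lam[["1"]])                        # likelihood f(y)
theta <- lapply(setNames(c("1", "2", "3"), c("1", "2", "3")),
                function(j) alp[[j]] * lam[[j]] / f)     # marginal theta_j(k)
# parent-child joint posteriors theta_{j, rho(j)}(k, l); rows = k, columns = l
theta_joint <- lapply(setNames(c("2", "3"), c("2", "3")), function(j)
  (lam[[j]] %o% (alp[["1"]] * lam[["1"]] / lam_up[[j]])) * tau[[j]] / f)
theta[["2"]]                 # marginal posterior of S_2
rowSums(theta_joint[["2"]])  # marginalizing the parent recovers theta_2
```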

## Three-step approach for covariate effect

# Experimental Study with R Package \texttt{catlvm}

## Cross-sectional data

### Data description

### Latent class analysis

### Joint latent class analysis

## Longitudinal data

### Data description

### Latent transition analysis

### Latent class profile analysis

## More complex models

### Data description

### Joint latent transition analysis

### Joint latent class profile analysis

### Multivariate latent class profile analysis

# Conclusion {#conc}


