The lcda package provides latent class discriminant analysis methods for
categorical predictors. The main functions are:
- lcda() for class-specific latent class models.
- cclcda() for common-components latent class models.
- cclcda2() for common-components models with class-conditional mixing weights.

All manifest variables and class labels must be integer-coded and start at 1.
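As a minimal sketch of this coding requirement (the data frame and column names below are made up for illustration and are not part of the package), any factor can be recoded with as.integer(), since factor codes already start at 1:

```r
# Hypothetical categorical data frame; not from the lcda package.
df <- data.frame(
  color = factor(c("red", "green", "red", "blue")),
  size  = factor(c("small", "large", "small", "small"))
)
df[] <- lapply(df, as.integer)  # factor levels become integer codes 1, 2, 3, ...
str(df)
```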
The methods in lcda implement local discrimination for discrete variables using
latent class analysis (LCA), following Bücker, Szepannek, and Weihs (2010). The key
idea is to replace a single class-conditional distribution with a finite mixture of
locally independent components. This lets each class capture heterogeneity while
keeping the model tractable for categorical data.
Let K be the number of classes, M the number of latent components (written M_k when
each class k has its own number of components), D the number of manifest variables,
and R_d the number of outcomes for variable d.
The indicator x_dr equals 1 if variable d takes outcome r and 0 otherwise.
Each class has its own latent class model:
$$ f_k(x) = \sum_{m=1}^{M_k} w_{mk} \prod_{d=1}^D \prod_{r=1}^{R_d} \theta_{mkdr}^{x_{dr}} $$
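To make the formula concrete, the sketch below evaluates f_k(x) for a single observation. The parameter layout (a list of per-component probability matrices) is an assumption chosen for illustration, not the package's internal representation.

```r
# theta_k: list of M_k matrices, one per component m; row d of each matrix holds
#          the outcome probabilities theta_{mkd.} of manifest variable d.
# w_k:     numeric vector of M_k mixing weights (summing to 1).
# x:       integer vector of length D with the observed outcome of each variable.
f_k <- function(x, theta_k, w_k) {
  comp <- vapply(seq_along(w_k), function(m) {
    th <- theta_k[[m]]
    prod(th[cbind(seq_along(x), x)])  # product over variables d of theta_{mkd, x_d}
  }, numeric(1))
  sum(w_k * comp)                     # mixture over the M_k components
}
```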
Classification follows the Bayes decision rule:
$$ \hat{k}(x) = \arg\max_k \pi_k f_k(x) $$
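Building on the f_k() sketch above, the decision rule is an argmax over classes of prior times class-conditional density. This is again a hedged illustration, not the package's predict method:

```r
# theta: list over classes, each element a list of component matrices as above.
# w:     list over classes of mixing-weight vectors.
# prior: numeric vector of class priors pi_k.
classify <- function(x, theta, w, prior) {
  scores <- mapply(function(theta_k, w_k, pi_k) pi_k * f_k(x, theta_k, w_k),
                   theta, w, prior)
  which.max(scores)  # index of the predicted class
}
```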
Common-components models share the component distributions across classes, while allowing class-specific mixing weights:
$$ f_k(x) = \sum_{m=1}^{M} w_{mk} \prod_{d=1}^D \prod_{r=1}^{R_d} \theta_{mdr}^{x_{dr}} $$
cclcda() first estimates the shared LCA on the pooled data and then derives
class-conditional weights. cclcda2() estimates weights and response
probabilities jointly in each EM step.
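A hedged sketch of fitting both common-components variants on the same data follows. The synthetic data are made up for illustration, and the matrix/grouping call is assumed to match the package's default (non-formula) interface:

```r
library(lcda)
set.seed(1)

# Synthetic integer-coded data: 4 manifest variables with 3 outcomes each,
# two classes labelled 1 and 2. Purely illustrative.
x        <- matrix(sample(1:3, 200 * 4, replace = TRUE), ncol = 4)
grouping <- sample(1:2, 200, replace = TRUE)

fit_cc  <- cclcda(x, grouping, m = 2)   # pooled LCA first, class weights derived afterwards
fit_cc2 <- cclcda2(x, grouping, m = 2)  # weights and response probabilities estimated jointly
```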
Parameter estimation uses the EM algorithm with random restarts (controlled by the
nrep argument). Model selection can be guided by AIC, BIC, the likelihood-ratio
statistic (Gsq), and the Pearson chi-square statistic (Chisq). For common-components
models, additional quality measures are provided as well; all of these criteria are
reported in the fitted model objects returned by cclcda() and cclcda2().
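For example, one way to choose the number of components is to refit the model for several values of m and compare BIC. The sketch below reuses the synthetic data from the previous sketch so that it runs on its own; nrep controls the number of random EM restarts:

```r
library(lcda)
set.seed(1)
x        <- matrix(sample(1:3, 200 * 4, replace = TRUE), ncol = 4)
grouping <- sample(1:2, 200, replace = TRUE)

# Refit cclcda2 with 1 to 3 latent components and keep the BIC of each fit.
bic_by_m <- sapply(1:3, function(m) cclcda2(x, grouping, m = m, nrep = 3)$bic)
which.min(bic_by_m)  # smallest BIC suggests the preferred number of components
```

The example below applies the same workflow to a discretised version of the iris data.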
```r
library(lcda)
data(iris)

# Discretise the four measurements into four categories each and recode the
# species as integers 1-3 (all variables must start at 1).
iris_cat <- within(iris, {
  Sepal.Length <- as.integer(cut(Sepal.Length, breaks = c(-Inf, 5.1, 5.8, 6.4, Inf)))
  Sepal.Width  <- as.integer(cut(Sepal.Width,  breaks = c(-Inf, 2.8, 3.0, 3.3, Inf)))
  Petal.Length <- as.integer(cut(Petal.Length, breaks = c(-Inf, 1.6, 4.35, 5.1, Inf)))
  Petal.Width  <- as.integer(cut(Petal.Width,  breaks = c(-Inf, 0.3, 1.3, 1.8, Inf)))
  Species3     <- as.integer(Species)
})

model <- cclcda2(
  Species3 ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
  data = iris_cat,
  m = 1
)
model$bic
```
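Class predictions can then be obtained with the package's predict method for the fitted object. The sketch below assumes the method returns a list with class and posterior elements and that new data must use the same integer coding as the training data:

```r
# Predict on the training data and cross-tabulate against the true species.
pred <- predict(model, iris_cat[, c("Sepal.Length", "Sepal.Width",
                                    "Petal.Length", "Petal.Width")])
table(predicted = pred$class, actual = iris_cat$Species3)
```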
Bücker, M., Szepannek, G., Weihs, C. (2010). Local Classification of Discrete Variables by Latent Class Models. In: Locarek-Junge, H., Weihs, C. (eds) Classification as a Tool for Research. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10745-0_13
Bücker, M. (2008). Lokale Diskrimination diskreter Daten. Diploma thesis, Faculty of Statistics, TU Dortmund.