fitLCA: Latent class analysis model


Description

Estimation and model selection for latent class analysis and latent class regression models for clustering multivariate categorical data. The best model is automatically selected using BIC.

Usage

fitLCA(Y, G = 1:3, X = NULL, ctrlLCA = controlLCA())

Arguments

Y

A data frame of categorical (response) variables. The variables used to fit the latent class analysis model are converted to factors.

G

An integer vector specifying the numbers of latent classes for which the BIC is to be calculated.

X

A vector or data frame of concomitant covariates used to predict the class-membership probabilities. If supplied, the number of observations in X must match the number of observations in Y. If NULL, a model with no predictor variables is estimated.

ctrlLCA

A list of control parameters for the EM algorithm used to fit the model.

Details

The function is a simple wrapper around the function poLCA in the package of the same name, and it returns less information about the estimated model than poLCA does. The number of latent classes is selected automatically by means of the Bayesian information criterion (BIC).
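
As a rough illustration of the selection step, the base R sketch below computes BIC values for a few candidate models and picks the best one. The log-likelihoods, the numbers of categories, the sample size, and the helper names nparLCA and bicValue are all hypothetical. The parameter count is the standard one for a latent class model without covariates. The sign convention 2 * loglik - npar * log(n), where larger is better, is an assumption and should be checked against the package before relying on it.

```r
# Number of free parameters of a latent class model without covariates:
# (G - 1) mixing proportions plus, in each class, (levels - 1) per variable.
nparLCA <- function(G, nLevels) (G - 1) + G * sum(nLevels - 1)

# BIC under the ASSUMED "larger is better" convention.
bicValue <- function(loglik, npar, n) 2 * loglik - npar * log(n)

G <- 1:4
nLevels <- c(3, 2, 2, 3)                  # hypothetical numbers of categories
loglik <- c(-2900, -2760, -2750, -2748)   # hypothetical maximized log-likelihoods
n <- 1200                                 # hypothetical sample size

npar <- sapply(G, nparLCA, nLevels = nLevels)
BIC <- setNames(bicValue(loglik, npar, n), G)
bestG <- G[which.max(BIC)]                # use which.min() under the opposite sign
```

With these made-up numbers the improvement in log-likelihood beyond two classes does not offset the penalty for the extra parameters, so the two-class model is retained.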

When included, covariates are used to predict the probability of class membership. In this case the model is termed "latent class regression" or, alternatively, "concomitant-variable latent class analysis". See poLCA for details.
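
To make the role of the covariates concrete, the following base R sketch turns a matrix of multinomial logit coefficients into class-membership probabilities, with class 1 as the reference. The coefficient values and the covariate x are invented for illustration. The layout (one row per term, G - 1 columns) follows the description of coeff in the Value section, and the presence of an intercept row is an assumption.

```r
# Hypothetical coefficients for G = 3 classes; columns refer to classes 2 and 3,
# rows to the (assumed) intercept and a single covariate x.
coeff <- matrix(c(-0.5, 0.2,     # class 2 vs class 1
                   1.0, -0.3),   # class 3 vs class 1
                nrow = 2,
                dimnames = list(c("(Intercept)", "x"), c("class2", "class3")))

x <- c(0, 1, 2)                  # hypothetical covariate values
design <- cbind(1, x)            # design matrix with an intercept column

# Class 1 has linear predictor 0, hence the leading column of ones.
expEta <- exp(design %*% coeff)
prior <- cbind(1, expEta) / (1 + rowSums(expEta))
colnames(prior) <- paste0("class", 1:3)
```

Each row of prior holds the class-membership probabilities for one covariate value and sums to one.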

Value

An object of class 'fitLCA' providing the optimal latent class model selected by BIC.

The output is a list containing:

G

The best number of latent classes according to BIC.

parameters

A list with the following components:

tau

The estimated mixing proportions.

theta

The estimated class conditional probabilities.

coeff

Multinomial logit coefficient estimates on the covariates (when provided). coeff is a matrix with G-1 columns, and one row for each covariate. All logit coefficients are calculated for each class with respect to class 1, assumed as reference by default.

loglik

Value of the maximized log-likelihood.

BIC

All BIC values computed for the range of values of G provided.

bic

The optimal BIC value.

npar

Number of estimated parameters.

resDf

Number of residual degrees of freedom.

z

A matrix whose [i,g] entry is the probability that observation i belongs to the gth class.

class

The maximum a posteriori (MAP) classification, which assigns each observation to the class with the largest probability in the corresponding row of z.

iter

Number of iterations.
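
The relation between z and class can be sketched in base R as follows. The matrix below is a small hypothetical example, not output from the package, and the uncertainty measure is an extra illustration that fitLCA is not documented to return.

```r
# Hypothetical posterior probabilities for 3 observations and G = 3 classes.
z <- matrix(c(0.70, 0.20, 0.10,
              0.10, 0.30, 0.60,
              0.25, 0.50, 0.25),
            ncol = 3, byrow = TRUE)

# MAP classification: index of the largest probability in each row.
mapClass <- apply(z, 1, which.max)

# One minus the largest probability gives a simple classification uncertainty.
uncertainty <- 1 - apply(z, 1, max)
```

Here the three observations are assigned to classes 1, 3 and 2 respectively, and the third one is the least certain.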

References

Linzer, D. A. and Lewis, J. B. (2011). poLCA: An R package for polytomous variable latent class analysis. Journal of Statistical Software, 42(10), 1-29.

See Also

poLCA

Examples

data(gss82, package = "poLCA")
maxG(gss82, 1:7)      # not all latent class models can be fitted
fit <- fitLCA(gss82, G = 1:4)

## Not run: 
# decrease the tolerance and increase the number of replicates
fit2 <- fitLCA(gss82, G = 1:4, ctrlLCA = controlLCA(tol = 1e-06, nrep = 10))

## End(Not run)

# the example with a single covariate as in ?poLCA
data(election, package = "poLCA")
elec <- election[, c("MORALG", "CARESG", "KNOWG", "LEADG", "DISHONG", "INTELG",
                     "MORALB", "CARESB", "KNOWB", "LEADB", "DISHONB", "INTELB")]
party <- election$PARTY
fit <- fitLCA(elec, G = 3, X = party)
pidmat <- cbind(1, 1:7)
exb <- exp(pidmat %*% fit$coeff)
matplot(1:7, ( cbind(1, exb)/(1 + rowSums(exb)) ),
        ylim = c(0,1), type = "l",
        main = "Party ID as a predictor of candidate affinity class",
        xlab = "Party ID: strong Democratic (1) to strong Republican (7)",
        ylab = "Probability of latent class membership", 
        lwd = 2 , col = 1)

LCAvarsel documentation built on May 2, 2019, 3:43 a.m.