
Estimates latent class and latent class regression models for polytomous outcome variables.


| Argument | Description |
| --- | --- |
| `formula` | A formula expression of the form |
| `data` | A data frame containing variables in |
| `nclass` | The number of latent classes to assume in the model. Setting |
| `maxiter` | The maximum number of iterations through which the estimation algorithm will cycle. |
| `graphs` | Logical, for whether |
| `tol` | A tolerance value for judging when convergence has been reached. When the one-iteration change in the estimated log-likelihood is less than |
| `na.rm` | Logical, for how |
| `probs.start` | A list of matrices of class-conditional response probabilities to be used as the starting values for the estimation algorithm. Each matrix in the list corresponds to one manifest variable, with one row for each latent class, and one column for each outcome. The default is |
| `nrep` | Number of times to estimate the model, using different values of |
| `verbose` | Logical, indicating whether |
| `calc.se` | Logical, indicating whether |

Latent class analysis, also known as latent structure analysis, is a technique for the analysis of clustering among observations in multi-way tables of qualitative/categorical variables. The central idea is to fit a model in which any confounding between the manifest variables can be explained by a single unobserved "latent" categorical variable. `poLCA` uses the assumption of local independence to estimate a mixture model of latent multi-way tables, the number of which (`nclass`) is specified by the user. Estimated parameters include the class-conditional response probabilities for each manifest variable, the "mixing" proportions denoting population share of observations corresponding to each latent multi-way table, and coefficients on any class-predictor covariates, if specified in the model.

Model specification: Latent class models have more than one manifest variable, so the response variables are `cbind(dv1,dv2,dv3...)` where `dv#` refer to variable names in the data frame. For models with no covariates, the formula is `cbind(dv1,dv2,dv3)~1`. For models with covariates, replace the `~1` with the desired function of predictors `iv1,iv2,iv3...` as, for example, `cbind(dv1,dv2,dv3)~iv1+iv2*iv3`.
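As a small illustration of the specification above, the sketch below builds both kinds of formula in base R and inspects them before any model is fitted. The names `dv1`, `dv2`, `dv3`, `iv1`, `iv2`, `iv3` are the placeholders used in the text, and the helper object names (`f_basic`, `f_covar`, `f_built`) are hypothetical.

```r
# Placeholder variable names taken from the text; no data frame is needed
# just to construct and inspect a formula.
f_basic <- cbind(dv1, dv2, dv3) ~ 1               # no covariates
f_covar <- cbind(dv1, dv2, dv3) ~ iv1 + iv2 * iv3 # covariates predict class membership

# Left-hand side: the manifest variables bound together with cbind().
manifest <- all.vars(f_covar[[2]])

# Right-hand side: the expanded covariate terms (iv2 * iv3 adds an interaction).
covariate_terms <- attr(terms(f_covar), "term.labels")

# The same basic formula assembled from a character vector of column names.
dv_names <- c("dv1", "dv2", "dv3")
f_built <- as.formula(paste0("cbind(", paste(dv_names, collapse = ", "), ") ~ 1"))

print(manifest)
print(covariate_terms)
print(f_built)
```

Building the formula from a character vector can be convenient when the set of manifest variables is long or chosen programmatically.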

`poLCA` treats all manifest variables as qualitative/categorical/nominal – NOT as ordinal.
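Because the manifest variables are treated as nominal, only the identity of each category matters, not its order. The base R sketch below recodes raw responses into integer category codes. It assumes, which the text above does not state, that each manifest variable should be coded with consecutive integers starting at 1 and that missing values are `NA`. The data frame `raw` and the helper `to_codes` are hypothetical.

```r
# Hypothetical raw responses: one string-valued item and one 0/1 item.
raw <- data.frame(
  dv1 = c("no", "yes", "yes", NA, "no"),
  dv2 = c(0, 1, 1, 0, NA),
  stringsAsFactors = FALSE
)

# factor() assigns one level per distinct value (sorted), and as.integer()
# returns the level index, so codes run 1, 2, ..., K with NA preserved.
to_codes <- function(x) as.integer(factor(x))

coded <- as.data.frame(lapply(raw, to_codes))
print(coded)
```

Any ordering present in the original categories is ignored by the model, so an ordinal item recoded this way is still analysed as nominal.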

`poLCA` returns an object of class `poLCA`: a list containing the following elements:

| Element | Description |
| --- | --- |
| `y` | data frame of manifest variables. |
| `x` | data frame of covariates, if specified. |
| `N` | number of cases used in model. |
| `Nobs` | number of fully observed cases (less than or equal to |
| `probs` | estimated class-conditional response probabilities. |
| `probs.se` | standard errors of estimated class-conditional response probabilities, in the same format as |
| `P` | sizes of each latent class; equal to the mixing proportions in the basic latent class model, or the mean of the priors in the latent class regression model. |
| `P.se` | the standard errors of the estimated |
| `posterior` | matrix of posterior class membership probabilities; also see function |
| `predclass` | vector of predicted class memberships, by modal assignment. |
| `predcell` | table of observed versus predicted cell counts for cases with no missing values; also see functions |
| `llik` | maximum value of the log-likelihood. |
| `numiter` | number of iterations until reaching convergence. |
| `maxiter` | maximum number of iterations through which the estimation algorithm was set to run. |
| `coeff` | multinomial logit coefficient estimates on covariates (when estimated). |
| `coeff.se` | standard errors of coefficient estimates on covariates (when estimated), in the same format as |
| `coeff.V` | covariance matrix of coefficient estimates on covariates (when estimated). |
| `aic` | Akaike Information Criterion. |
| `bic` | Bayesian Information Criterion. |
| `Gsq` | Likelihood ratio/deviance statistic. |
| `Chisq` | Pearson Chi-square goodness of fit statistic for fitted vs. observed multiway tables. |
| `time` | length of time it took to run the model. |
| `npar` | number of degrees of freedom used by the model (estimated parameters). |
| `resid.df` | number of residual degrees of freedom. |
| `attempts` | a vector containing the maximum log-likelihood values found in each of the |
| `eflag` | Logical, error flag. |
| `probs.start` | A list of matrices containing the class-conditional response probabilities used as starting values in the estimation algorithm. If the algorithm needed to restart (see |
| `probs.start.ok` | Logical. |
| `call` | function call to |
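To make the relationship between `posterior`, `predclass`, and `P` concrete, the sketch below works through modal assignment on a small, made-up posterior matrix in base R. The matrix `post` is hypothetical and not output from any fitted model. The point it illustrates is that class shares obtained by modal assignment need not equal the average posterior probabilities, which is why the printed output reports "Estimated class population shares" and "Predicted class memberships" separately.

```r
# Hypothetical posterior class membership probabilities:
# one row per case, one column per latent class, rows sum to 1.
post <- rbind(
  c(0.90, 0.10),
  c(0.60, 0.40),
  c(0.45, 0.55),
  c(0.20, 0.80)
)

# Modal assignment: each case goes to the class with the largest posterior.
predclass <- apply(post, 1, which.max)

# Share of cases assigned to each class by modal assignment.
shares_modal <- as.vector(table(factor(predclass, levels = 1:ncol(post)))) / nrow(post)

# Average posterior probability per class (the mixing-proportion analogue).
shares_mean <- colMeans(post)

print(predclass)
print(shares_modal)
print(shares_mean)
```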

`poLCA` uses EM and Newton-Raphson algorithms to maximize the latent class model log-likelihood function. Depending on the starting parameters, this algorithm may only locate a local, rather than global, maximum. This becomes more and more of a problem as `nclass` increases. It is therefore highly advisable to run `poLCA` multiple times until you are relatively certain that you have located the global maximum log-likelihood. As long as `probs.start=NULL`, each function call will use different (random) initial starting parameters. Alternatively, setting `nrep` to a value greater than one enables the user to estimate the latent class model multiple times with a single call to `poLCA`, thus conducting the search for the global maximizer automatically.
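The sketch below shows one way to build starting values in the shape described for `probs.start` (one matrix per manifest variable, one row per class, one column per outcome, each row summing to 1), followed by a guarded comparison against a multi-start run with `nrep`. The helper `random_probs_start` is hypothetical and not part of the package; the model-fitting part only runs if `poLCA` is installed, and the seed and the choice of `nrep = 10` are arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the starting values are reproducible

# Hypothetical helper: random class-conditional probabilities, one matrix
# per manifest variable (rows = classes, columns = outcomes).
random_probs_start <- function(n_outcomes, nclass) {
  lapply(n_outcomes, function(k) {
    m <- matrix(runif(nclass * k), nrow = nclass, ncol = k)
    m / rowSums(m)  # normalise each row to sum to 1
  })
}

# Four manifest variables with two outcomes each, as in the `values` data.
start <- random_probs_start(n_outcomes = c(2, 2, 2, 2), nclass = 2)

if (requireNamespace("poLCA", quietly = TRUE)) {
  library(poLCA)
  data(values)
  f <- cbind(A, B, C, D) ~ 1
  # One run from the user-supplied starting values.
  fit_start <- poLCA(f, values, nclass = 2, probs.start = start, verbose = FALSE)
  # Ten runs from different random starts; the best one is returned.
  fit_nrep <- poLCA(f, values, nclass = 2, nrep = 10, verbose = FALSE)
  print(c(single = fit_start$llik, best_of_10 = fit_nrep$llik))
  print(fit_nrep$attempts)  # log-likelihood reached in each attempt
}
```

If the values in `attempts` differ noticeably from one another, some runs stopped at local maxima, which suggests increasing `nrep` before trusting the reported solution.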

The term "Latent class regression" (LCR) can have two meanings. In this package, LCR models refer to latent class models in which the probability of class membership is predicted by one or more covariates. However, in other contexts, LCR is also used to refer to regression models in which the manifest variable is partitioned into some specified number of latent classes as part of estimating the regression model. It is a way to simultaneously fit more than one regression to the data when the latent data partition is unknown. The `flexmix` function in package flexmix will estimate this other type of LCR model. Because of these terminology issues, the LCR models this package estimates are sometimes termed "latent class models with covariates" or "concomitant-variable latent class analysis," both of which are accurate descriptions of this model.
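In the sense used by this package, the covariates enter through a multinomial logit on class membership, with the first class as the reference. The base R sketch below turns the coefficient estimates printed in the example output on this page (the `2 / 1` and `3 / 1` blocks of the three-class `PARTY` model) into prior class membership probabilities for each value of `PARTY`. It is the same arithmetic as the `matplot` call in the Examples, written out step by step; the object names are hypothetical.

```r
# Coefficients copied from the example output: log-odds of class 2 vs. 1
# and of class 3 vs. 1 as a linear function of PARTY.
coeff <- cbind(
  "2/1" = c(3.81813, -0.79327),
  "3/1" = c(4.97967, -1.36762)
)
rownames(coeff) <- c("(Intercept)", "PARTY")

# Design matrix: an intercept column plus PARTY = 1, ..., 7.
X <- cbind(1, 1:7)

# exp(linear predictor) for classes 2 and 3; class 1 is the reference (exp(0) = 1).
exb <- exp(X %*% coeff)

# Multinomial logit: divide each class's term by the sum over all classes.
priors <- cbind(1, exb) / (1 + rowSums(exb))
colnames(priors) <- paste("class", 1:3)
rownames(priors) <- paste("PARTY", 1:7)

print(round(priors, 3))
```

Each row gives the probability of belonging to each class before any manifest responses are observed, so the rows sum to 1.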

A more detailed user's manual is available online at http://userwww.service.emory.edu/~dlinzer/poLCA.

Agresti, Alan. 2002. *Categorical Data Analysis, second edition*. Hoboken: John Wiley & Sons.

Bandeen-Roche, Karen, Diana L. Miglioretti, Scott L. Zeger, and Paul J. Rathouz. 1997. "Latent Variable Regression for Multiple Discrete Outcomes." *Journal of the American Statistical Association*. 92(440): 1375-1386.

Hagenaars, Jacques A. and Allan L. McCutcheon, eds. 2002. *Applied Latent Class Analysis*. Cambridge: Cambridge University Press.

McLachlan, Geoffrey J. and Thriyambakam Krishnan. 1997. *The EM Algorithm and Extensions*. New York: John Wiley & Sons.

```
library(poLCA)

##
## Three models without covariates:
## M0: Loglinear independence model.
## M1: Two-class latent class model.
## M2: Three-class latent class model.
##
data(values)
f <- cbind(A, B, C, D) ~ 1
M0 <- poLCA(f, values, nclass = 1)                  # log-likelihood: -543.6498
M1 <- poLCA(f, values, nclass = 2)                  # log-likelihood: -504.4677
M2 <- poLCA(f, values, nclass = 3, maxiter = 8000)  # log-likelihood: -503.3011

##
## Three-class model with a single covariate.
##
data(election)
f2a <- cbind(MORALG, CARESG, KNOWG, LEADG, DISHONG, INTELG,
             MORALB, CARESB, KNOWB, LEADB, DISHONB, INTELB) ~ PARTY
nes2a <- poLCA(f2a, election, nclass = 3, nrep = 5)  # log-likelihood: -16222.32

# Design matrix: intercept plus each value of PARTY (1 to 7).
pidmat <- cbind(1, 1:7)
# exp(linear predictor) for classes 2 and 3 relative to class 1.
exb <- exp(pidmat %*% nes2a$coeff)
# Plot the implied prior probability of each class against PARTY.
matplot(1:7, cbind(1, exb) / (1 + rowSums(exb)), ylim = c(0, 1), type = "l",
        main = "Party ID as a predictor of candidate affinity class",
        xlab = "Party ID: strong Democratic (1) to strong Republican (7)",
        ylab = "Probability of latent class membership", lwd = 2, col = 1)
text(5.9, 0.35, "Other")
text(5.4, 0.7, "Bush affinity")
text(1.8, 0.6, "Gore affinity")
```

```
Loading required package: scatterplot3d
Loading required package: MASS
Conditional item response (column) probabilities,
by outcome variable, for each class (row)
$A
Pr(1) Pr(2)
class 1: 0.2083 0.7917
$B
Pr(1) Pr(2)
class 1: 0.5 0.5
$C
Pr(1) Pr(2)
class 1: 0.4861 0.5139
$D
Pr(1) Pr(2)
class 1: 0.6898 0.3102
Estimated class population shares
1
Predicted class memberships (by modal posterior prob.)
1
=========================================================
Fit for 1 latent classes:
=========================================================
number of observations: 216
number of estimated parameters: 4
residual degrees of freedom: 11
maximum log-likelihood: -543.6498
AIC(1): 1095.3
BIC(1): 1108.801
G^2(1): 81.08423 (Likelihood ratio/deviance statistic)
X^2(1): 104.1071 (Chi-square goodness of fit)
Conditional item response (column) probabilities,
by outcome variable, for each class (row)
$A
Pr(1) Pr(2)
class 1: 0.0068 0.9932
class 2: 0.2864 0.7136
$B
Pr(1) Pr(2)
class 1: 0.0602 0.9398
class 2: 0.6704 0.3296
$C
Pr(1) Pr(2)
class 1: 0.0735 0.9265
class 2: 0.6460 0.3540
$D
Pr(1) Pr(2)
class 1: 0.2309 0.7691
class 2: 0.8676 0.1324
Estimated class population shares
0.2792 0.7208
Predicted class memberships (by modal posterior prob.)
0.3287 0.6713
=========================================================
Fit for 2 latent classes:
=========================================================
number of observations: 216
number of estimated parameters: 9
residual degrees of freedom: 6
maximum log-likelihood: -504.4677
AIC(2): 1026.935
BIC(2): 1057.313
G^2(2): 2.719922 (Likelihood ratio/deviance statistic)
X^2(2): 2.719764 (Chi-square goodness of fit)
Conditional item response (column) probabilities,
by outcome variable, for each class (row)
$A
Pr(1) Pr(2)
class 1: 0.0022 0.9978
class 2: 0.1557 0.8443
class 3: 0.5188 0.4812
$B
Pr(1) Pr(2)
class 1: 0.0204 0.9796
class 2: 0.5013 0.4987
class 3: 0.9053 0.0947
$C
Pr(1) Pr(2)
class 1: 0.0000 1.0000
class 2: 0.5522 0.4478
class 3: 0.7310 0.2690
$D
Pr(1) Pr(2)
class 1: 0.0874 0.9126
class 2: 0.7983 0.2017
class 3: 0.9251 0.0749
Estimated class population shares
0.193 0.5804 0.2266
Predicted class memberships (by modal posterior prob.)
0.1944 0.662 0.1435
=========================================================
Fit for 3 latent classes:
=========================================================
number of observations: 216
number of estimated parameters: 14
residual degrees of freedom: 1
maximum log-likelihood: -503.3011
AIC(3): 1034.602
BIC(3): 1081.856
G^2(3): 0.3868563 (Likelihood ratio/deviance statistic)
X^2(3): 0.4225484 (Chi-square goodness of fit)
Model 1: llik = -16222.32 ... best llik = -16222.32
Model 2: llik = -16222.32 ... best llik = -16222.32
Model 3: llik = -16222.32 ... best llik = -16222.32
Model 4: llik = -16222.32 ... best llik = -16222.32
Model 5: llik = -16222.32 ... best llik = -16222.32
Conditional item response (column) probabilities,
by outcome variable, for each class (row)
$MORALG
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.1081 0.3832 0.3038 0.2048
class 2: 0.1373 0.6682 0.1802 0.0143
class 3: 0.6221 0.3350 0.0172 0.0258
$CARESG
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.0356 0.2281 0.4501 0.2861
class 2: 0.0388 0.6138 0.2886 0.0589
class 3: 0.4858 0.4164 0.0534 0.0444
$KNOWG
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.1436 0.5327 0.2556 0.068
class 2: 0.0712 0.8173 0.1025 0.009
class 3: 0.7189 0.2451 0.0040 0.032
$LEADG
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.0278 0.1899 0.5137 0.2685
class 2: 0.0256 0.6280 0.3144 0.0320
class 3: 0.4720 0.4326 0.0643 0.0311
$DISHONG
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.1876 0.3435 0.3078 0.1611
class 2: 0.0210 0.1412 0.5341 0.3037
class 3: 0.0518 0.0538 0.2890 0.6054
$INTELG
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.1704 0.5597 0.2000 0.0698
class 2: 0.0698 0.8159 0.1003 0.0140
class 3: 0.7381 0.2219 0.0089 0.0311
$MORALB
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.4477 0.5135 0.0313 0.0075
class 2: 0.0310 0.6326 0.3003 0.0361
class 3: 0.1610 0.3749 0.3163 0.1478
$CARESB
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.2516 0.6185 0.1097 0.0202
class 2: 0.0047 0.3274 0.5083 0.1597
class 3: 0.0458 0.1490 0.3780 0.4272
$KNOWB
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.3427 0.5913 0.0660 0.0000
class 2: 0.0121 0.6511 0.2920 0.0447
class 3: 0.1300 0.3503 0.2989 0.2209
$LEADB
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.3850 0.5803 0.0287 0.0060
class 2: 0.0301 0.5884 0.3294 0.0521
class 3: 0.0743 0.3035 0.3882 0.2340
$DISHONB
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.0163 0.0680 0.2923 0.6234
class 2: 0.0264 0.1855 0.5931 0.1950
class 3: 0.0914 0.3117 0.3517 0.2451
$INTELB
1 Extremely well 2 Quite well 3 Not too well 4 Not well at all
class 1: 0.3765 0.5878 0.0357 0.0000
class 2: 0.0350 0.6646 0.2616 0.0388
class 3: 0.1877 0.3637 0.2698 0.1787
Estimated class population shares
0.3405 0.3859 0.2736
Predicted class memberships (by modal posterior prob.)
0.3415 0.3815 0.2769
=========================================================
Fit for 3 latent classes:
=========================================================
2 / 1
Coefficient Std. error t value Pr(>|t|)
(Intercept) 3.81813 0.31109 12.274 0
PARTY -0.79327 0.06232 -12.728 0
=========================================================
3 / 1
Coefficient Std. error t value Pr(>|t|)
(Intercept) 4.97967 0.32771 15.195 0
PARTY -1.36762 0.08081 -16.924 0
=========================================================
number of observations: 1300
number of estimated parameters: 112
residual degrees of freedom: 1188
maximum log-likelihood: -16222.32
AIC(3): 32668.65
BIC(3): 33247.7
X^2(3): 34565233714 (Chi-square goodness of fit)
```
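
The fit statistics printed for the three models without covariates can be checked by hand. The base R sketch below recomputes the number of parameters, AIC, BIC, and residual degrees of freedom from the log-likelihoods and the number of observations shown in the output above. The formulas are the standard definitions; the residual degrees of freedom line assumes the fully cross-classified 2 x 2 x 2 x 2 table has fewer cells than there are observations, as is the case here. Object names are hypothetical.

```r
# Values copied from the printed output for M0, M1 and M2.
fits <- data.frame(
  nclass = 1:3,
  llik   = c(-543.6498, -504.4677, -503.3011)
)
n_obs      <- 216          # number of observations
n_outcomes <- c(2, 2, 2, 2) # outcomes per manifest variable (A, B, C, D)

# Free parameters without covariates: (K - 1) response probabilities per
# variable and class, plus (nclass - 1) mixing proportions.
fits$npar <- fits$nclass * sum(n_outcomes - 1) + (fits$nclass - 1)

fits$aic <- -2 * fits$llik + 2 * fits$npar
fits$bic <- -2 * fits$llik + fits$npar * log(n_obs)

# Independent cells in the observed table minus estimated parameters.
fits$resid.df <- (prod(n_outcomes) - 1) - fits$npar

print(fits)
print(c(best_by_aic = which.min(fits$aic), best_by_bic = which.min(fits$bic)))
```

Both criteria are smallest for the two-class model, which matches the AIC and BIC values reported in the output.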
