# lavTablesFit: Pairwise maximum likelihood fit statistics In yrosseel/lavaan: Latent Variable Analysis

## Description

Three measures of fit for the pairwise maximum likelihood estimation method that are based on likelihood ratios (LR) are defined: C_F, C_M, and C_P. Subscript F signifies a comparison of model-implied proportions of full response patterns with observed sample proportions, subscript M signifies a comparison of model-implied proportions of full response patterns with the proportions implied by the assumption of multivariate normality, and subscript P signifies a comparison of model-implied proportions of pairs of item responses with the observed proportions of pairs of item responses.

## Usage

    lavTablesFitCf(object)
    lavTablesFitCp(object, alpha = 0.05)
    lavTablesFitCm(object)

## Arguments

- object: An object of class lavaan.
- alpha: The nominal level of significance of global fit.

## Details

#### C_F

The C_F statistic compares the log-likelihood of the model-implied proportions (\hat{π}_r) with the observed proportions (p_r) of the full multivariate response patterns:

C_F = 2N∑_{r}p_{r}\ln[p_{r}/\hat{π}_{r}],

which asymptotically has a chi-square distribution with

df_F = m^k - n - 1,

where k denotes the number of items with discrete response scales, m denotes the number of response options, and n denotes the number of parameters to be estimated. Notice that C_F results may be biased because of large numbers of empty cells in the multivariate contingency table.
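The C_F formula and its degrees of freedom can be sketched in a few lines of base R. This is only an illustration: `lr_stat` and `df_cf` are hypothetical helpers, not lavaan functions, the proportions are invented, and the parameter count n = 8 is an assumed value. Empty cells (p_r = 0) are treated as contributing zero to the sum.

```r
# Hypothetical helper (not part of lavaan): likelihood ratio statistic
# 2 * N * sum(p * log(p / pi_hat)) computed from two vectors of proportions.
lr_stat <- function(p_obs, pi_hat, N) {
  stopifnot(length(p_obs) == length(pi_hat))
  keep <- p_obs > 0  # empty cells contribute 0 (the limit of p * log(p))
  2 * N * sum(p_obs[keep] * log(p_obs[keep] / pi_hat[keep]))
}

# Hypothetical helper: df_F = m^k - n - 1.
df_cf <- function(k, m, n) m^k - n - 1

# Invented proportions for the 4 response patterns of 2 binary items.
p_obs  <- c(0.40, 0.10, 0.20, 0.30)
pi_hat <- c(0.35, 0.15, 0.25, 0.25)
C_F <- lr_stat(p_obs, pi_hat, N = 200)

# Four binary items and an assumed n = 8 free parameters: 16 - 8 - 1.
df_example <- df_cf(k = 4, m = 2, n = 8)

# The number of response patterns m^k grows quickly, which is why empty
# cells are common: nine 5-point items already give 5^9 patterns.
n_patterns <- 5^9
```

With nine 5-point items the table has close to two million cells, so most cells stay empty at any realistic sample size, which is the source of the bias mentioned above.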

#### C_M

The C_M statistic is based on the C_F statistic, and compares the proportions implied by the model of interest (Model 1) with proportions implied by the assumption of an underlying multivariate normal distribution (Model 0):

C_M = C_{F1} - C_{F0},

where C_{F0} is C_F for Model 0 and C_{F1} is C_F for Model 1. Statistic C_M has a chi-square distribution with degrees of freedom

df_M = k(k-1)/2 + k(m-1) - n_{1},

where k denotes the number of items with discrete response scales, m denotes the number of response options, k(k-1)/2 is the number of polychoric correlations, k(m-1) is the number of thresholds, and n_1 is the number of parameters of the model of interest. Notice that C_M results may be biased because of large numbers of empty cells in the multivariate contingency table. However, the bias may cancel out, as both Model 1 and Model 0 contain the same pattern of empty responses.
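As a rough illustration of the C_M difference test, the base R sketch below computes df_M and a p-value. The helper `df_cm` is hypothetical (not a lavaan function), and the two C_F values and the parameter count n_1 = 8 are invented for the example.

```r
# Hypothetical helper (not part of lavaan):
# df_M = k(k-1)/2 polychoric correlations + k(m-1) thresholds - n_1.
df_cm <- function(k, m, n1) k * (k - 1) / 2 + k * (m - 1) - n1

# Four binary items, assumed n_1 = 8 parameters: 6 + 4 - 8.
df_M <- df_cm(k = 4, m = 2, n1 = 8)

# Invented C_F values for the model of interest (Model 1) and for the
# unrestricted multivariate normal model (Model 0).
C_F1 <- 12.4
C_F0 <- 9.1
C_M  <- C_F1 - C_F0

# Upper-tail chi-square probability for the difference statistic.
p_value <- pchisq(C_M, df = df_M, lower.tail = FALSE)
```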

#### C_P

With the C_P statistic we consider only pairs of responses, and compare the observed sample proportions (p) with the model-implied proportions (π) of pairs of responses. For items i and j we obtain a pairwise likelihood ratio test statistic C_{P_{ij}}:

C_{P_{ij}}=2N∑_{c_i=1}^m ∑_{c_j=1}^m p_{c_i,c_j}\ln[p_{c_i,c_j}/\hat{π}_{c_i,c_j}],

where m denotes the number of response options and N denotes the sample size. The C_P statistic for a pair of items has an asymptotic chi-square distribution with degrees of freedom equal to the number of independent cell proportions (m^2 - 1) minus the number of parameters (2(m - 1) thresholds and 1 correlation),

df_P = m^{2} - 2(m - 1) - 2.
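The degrees of freedom for a single pair can be tabulated for a few values of m with a short base R sketch. The helper `df_cp` is hypothetical (not a lavaan function). It counts the m^2 - 1 free cell proportions minus the 2(m - 1) thresholds and 1 correlation, which simplifies to m(m - 2).

```r
# Hypothetical helper (not part of lavaan): df for one pair of items,
# (m^2 - 1) free proportions minus 2(m - 1) thresholds and 1 correlation.
df_cp <- function(m) (m^2 - 1) - (2 * (m - 1) + 1)

m_values <- 2:5
df_values <- sapply(m_values, df_cp)

# Same quantity written as in the formula above, and in simplified form.
df_formula    <- m_values^2 - 2 * (m_values - 1) - 2
df_simplified <- m_values * (m_values - 2)

data.frame(m = m_values, df_P = df_values)
```

For m = 2, 3, 4, 5 this gives df_P = 0, 3, 8, 15.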

As k denotes the number of items, there are k(k-1)/2 possible pairs of items. The C_P statistic should therefore be applied with a Bonferroni adjusted level of significance α^*, with

α^* = α / (k(k-1)/2),

to keep the family-wise error rate at α. The hypothesis of overall goodness-of-fit is tested at α and is rejected as soon as C_P is significant at α^* for at least one pair of items. Notice that with dichotomous items, m = 2 and df_P = 0, so the hypothesis cannot be tested.
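The Bonferroni decision rule can be sketched in base R as follows. This is an illustration under stated assumptions, not the internals of lavTablesFitCp: `bonferroni_alpha` is a hypothetical helper, k = 4 and m = 3 are assumed values, and the six pairwise statistics are invented.

```r
# Hypothetical helper (not part of lavaan): alpha divided by the number
# of item pairs, k(k-1)/2.
bonferroni_alpha <- function(alpha, k) alpha / choose(k, 2)

k <- 4        # number of items (assumed)
m <- 3        # response options per item (assumed; m = 2 gives df_P = 0)
alpha <- 0.05

alpha_star <- bonferroni_alpha(alpha, k)  # 0.05 spread over 6 item pairs
df_p <- m^2 - 2 * (m - 1) - 2             # 3 for m = 3

# A pairwise C_P is significant at alpha_star if it exceeds this value.
crit <- qchisq(1 - alpha_star, df = df_p)

# Invented pairwise statistics for the 6 item pairs, illustration only.
cp_pairs <- c(2.1, 13.5, 4.0, 7.9, 0.8, 9.2)

# Overall fit is rejected as soon as one pair is significant.
reject_global <- any(cp_pairs > crit)
```

In this invented example only the second pair exceeds the adjusted critical value, which is enough to reject overall fit at α = 0.05.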

## References

Barendse, M. T., Ligtvoet, R., Timmerman, M. E., & Oort, F. J. (2016). Structural equation modeling of discrete data: Model fit after pairwise maximum likelihood. Frontiers in Psychology, 7, 1-8.

Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347-387.

## See Also

lavTables, lavaan

## Examples

    library(lavaan)

    # Data: nine Holzinger-Swineford test scores, each cut into two categories
    HS9 <- HolzingerSwineford1939[, c("x1", "x2", "x3", "x4", "x5",
                                      "x6", "x7", "x8", "x9")]
    HSbinary <- as.data.frame(lapply(HS9, cut, 2, labels = FALSE))

    # Single group example with one latent factor
    HS.model <- ' trait =~ x1 + x2 + x3 + x4 '
    fit <- cfa(HS.model, data = HSbinary[, 1:4], ordered = names(HSbinary),
               estimator = "PML")
    lavTablesFitCm(fit)
    lavTablesFitCp(fit)
    lavTablesFitCf(fit)