knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(knitr)
library(measurementInvariance)

Version 0.3.2

Package overview

measurementInvariance is an R package dedicated to sound measurement invariance (MI) analysis, focusing on issues of establishing partial MI. It subsumes SEM and IRT models by importing from lavaan and mirt respectively.

Imported packages:

Function overview

There are 4 main functions:

All functions deal with unidimensional models only. The typical workflow is intended to be testMI() -> clusterItems() -> either partialMI() or modelAveraging(). However, the functions can also be used independently.

Continuous data (& partial MI)

Here we are taking data from the Holzinger-Swinefort (1939) example on cognitive tests. We will use the "Speed" items with gender as a grouping variable, for which MI is to be tested.

suppressMessages(library(MBESS))
data(HS)
Data <- HS[, c("t10_addition", "t11_code", "t12_counting_groups_of_dots", # speed items
               "t13_straight_and_curved_capitals", 
               "t20_deduction", "t21_numerical_puzzles", "t22_problem_reasoning", # math items
               "t23_series_completion", "t24_woody_mccall", "sex")]
colnames(Data) <- sub("_.*", "", colnames(Data)) # shorten variable names for convenience
str(Data)

Assume, we are interested in establishing strong MI for the purpose of group comparison. The summary function gives an overview over model fit of sequentially tested MI models as well as their comparison.

speed <- c("t10", "t11", "t12", "t13")

testMIspeed <- testMI(items = speed, 
                      group = "sex",
                      data = Data,
                      MIlevel = "strong")
summary(testMIspeed)

We identify (borderline) issues with weak and definite problems with strong MI (for cut-offs see Chen, 2007).

We proceed to identify item clusters for which MI holds. Most importantly, we need to tell the clusterItem() function, for which parameter types there were issues with MI. In this case it's both loadings and difficulties (aka intercepts).

First, item clustering is done by setting a threshold in loading difference that is not to be surpassed by the items of a specific cluster. The smaller the threshold, the more homogeneous the items of a cluster become when compared across groups.

clusterItemsSpeed <- clusterItems(res_testMI = testMIspeed,
                                  clusterWhat = c("loadings", "difficulties"),
                                  method = "threshold",
                                  loadThreshold = 0.3,
                                  intThreshold = 0.5)
summary(clusterItemsSpeed, 
        order = "clusters")

Three clusters are found. In order to better understand the clustering process, which takes two steps by first clustering item loadings and then item difficulties, a plot can be requested.

plotCluster(clusterItemsSpeed,
            showLegend = TRUE)

Alternatively to the threshold criterion, a significance test can be used as stopping criterion in the clustering.

clusterItemsSpeed_p <- clusterItems(res_testMI = testMIspeed,
                                    clusterWhat = c("loadings", "difficulties"),
                                    method = "sigTest",
                                    alphaValue = 0.05)
summary(clusterItemsSpeed_p, 
        order = "clusters")

Here, four clusters are found. The items differ strong enough that they do not pair up at all according to the significance test.

After inspection of the items we might decide for going with cluster 2 of the threshold analysis as anchor items. Hence we call the partialMI() function. It directly displays the model estimates.

partialMI(res_clusterItems = clusterItemsSpeed, 
          anchor = 2) # cluster number

Alternatively, we might have some source other than the clustering for finding anchor candidates. partialMI() also takes item names as anchors and does not only rely on clusterItems(). We can also put in all model information.

partialMI(items = speed, 
          group = "sex",
          data = Data,
          MIlevel = "strong",
          partialMIwhat = c("loadings", "difficulties"), # equivalent to clusterWhat above
          anchor = c("t10", "t11")) # item names

Dichotomous data (& Bayesian model averaging)

The package provides good support for dichotomous data models via categorical SEM models in lavaan and IRT models in mirt. Here we have a second example from the FIMS study testing mathematical ability where a 2PL IRT model is applied. Our goal is to compare the latent means of two countries (1 Australia - 2 Japan). We thus need to establish strong MI.

suppressMessages(library(TAM))
data("data.fims.Aus.Jpn.scored")
# choosing only a subset of items and a subset of the sample to keep a lid on
# the computation times of Bayesian analyses
dataDich <- data.fims.Aus.Jpn.scored[c(1:500, 5801:6300), c(2, 3, 4, 8, 9, 11, 15, 16)] 
str(dataDich)

testMIfims <- testMI(items = colnames(dataDich[1:7]),
                    group = "country",
                    data = dataDich,
                    MIlevel = "strong",
                    itemType = "dichotomous",
                    dichModel = "2PL")
summary(testMIfims)

Contrary to the first example, we find no issues with weak MI, but a clear violation of strong MI.

If we want to have a look at a specific model estimated by testMI(), we can use getModel(). This prints a model summary. Additionally, any method from the core package (in this case: mirt) can be applied to the resulting object (e.g. coef, itemfit). Here, the weak MI model is requested.

resWeak <- getModel(testMIfims,
                    which = "weak")

Applying clusterItems() with a threshold for difficulties (or thresholds in IRT terms), we find three item clusters.

clusterItemsFIMS <- clusterItems(res_testMI = testMIfims,
                                 clusterWhat = "difficulties",
                                 method = "threshold",
                                 intThreshold = 0.6)
summary(clusterItemsFIMS)

Again, the clusters can be plotted. In this case, it is only a single plot as only difficulties were clustered.

plotCluster(clusterItemsFIMS,
            showLegend = TRUE)

The relative DIF values which form the basis of the plot can be printed as well.

printDIF(clusterItemsFIMS)

The three clusters can be subject to Bayesian model averaging which returns an averaged mean difference of the two countries. Here we will assume a completely naive weighting scheme by equal weights for all clusters. The resulting averaged mean difference is shown in a plot which illustrates the vastly different results depending on the chosen anchor set. One cluster yields an inverse result compared to the other two clusters, leveling out a mean difference on average.

bma <- modelAveraging(clusterItemsFIMS,
                      weights = rep(1/3, 3),
                      iter = 40000) # Runs long. Reduce for tests, if needed. 
plotAverage(bma)

A specific partial model from the Bayesian analysis can be accessed by getModel. We can see that it is cluster three whose items yield a negative mean difference.

getModel(bma, which = 3) # e.g. for cluster 3 as anchor

Other weights can be applied fast to an already estimated modelAveraging object:

bma2 <- modelAveraging(res_modelAveraging = bma,
                       weights = c(0.45, 0.45, 0.1))
plotAverage(bma2)


Dani-Schulze/measurementInvariance documentation built on Jan. 28, 2022, 1:56 a.m.