In M-E-Rademaker/cSEM: Composite-Based Structural Equation Modeling

cSEM: Composite-based SEM

Purpose

Estimate, analyse, test, and study linear, nonlinear, hierarchical and multi-group structural equation models using composite-based approaches and procedures, including estimation techniques such as partial least squares path modeling (PLS-PM) and its derivatives (PLSc, OrdPLSc, robustPLSc), generalized structured component analysis (GSCA), generalized structured component analysis with uniqueness terms (GSCAm), generalized canonical correlation analysis (GCCA), principal component analysis (PCA), factor score regression (FSR) using sum score, regression or Bartlett scores (including bias correction using Croon’s approach), as well as several tests and typical post-estimation procedures (e.g., verify admissibility of the estimates, assess the model fit, test the model fit, compute confidence intervals, compare groups, etc.).

News (2025-05-15):

Release of cSEM version 0.6.1
Release of cSEM Version 0.6.0
Implementation of a plot() function to visualize cSEM models. Thanks to Nguyen.
Enhancement of the predict() function

Installation

The package is available on CRAN:

install.packages("cSEM")

To install the development version, which is recommended, use:

# install.packages("devtools")
devtools::install_github("M-E-Rademaker/cSEM")

Getting started

The best place to get started is the cSEM-website.

Basic usage

The basic usage is illustrated below.

knitr::include_graphics("man/figures/api.png")

Usually, using cSEM is the same 3 step procedure:

Pick a dataset and specify a model using lavaan syntax

Use csem()

Apply one of the post-estimation functions listed below on the resulting object.

Post-Estimation Functions

There are five major post-estimation verbs, three test family functions and three do-family of function:

assess() : assess the model using common quality criteria
infer() : calculate common inferential quantities (e.g., standard errors, confidence intervals)
predict() : predict endogenous indicator values
plot() : Plot the cSEM model
summarize() : summarize the results
verify() : verify admissibility of the estimates

Tests are performed by using the test family of functions. Currently, the following tests are implemented:

testCVPAT() performs a cross-validated predictive ability test
testOMF() : performs a test for overall model fit
testMICOM() : performs a test for composite measurement invariance
testMGD() : performs several tests to assess multi-group differences
testHausman() : performs the regression-based Hausman test to test for endogeneity

Other miscellaneous post-estimation functions belong do the do-family of functions. Currently, three do functions are implemented:

doIPMA(): performs an importance-performance matrix analysis
doNonlinearEffectsAnalysis(): performs a nonlinear effects analysis such as floodlight and surface analysis
doRedundancyAnalysis(): performs a redundancy analysis

All functions require a cSEMResults object.

Example

Models are defined using lavaan syntax with some slight modifications (see the Specifying a model section on the cSEM-website). For illustration we use the build-in and well-known satisfaction dataset.

require(cSEM)

## Note: The operator "<~" tells cSEM that the construct to its left is modeled
##       as a composite.
##       The operator "=~" tells cSEM that the construct to its left is modeled
##       as a common factor.
##       The operator "~" tells cSEM which are the dependent (left-hand side) and
##       independent variables (right-hand side).

model <- "
# Structural model
EXPE ~ IMAG
QUAL ~ EXPE
VAL  ~ EXPE + QUAL
SAT  ~ IMAG + EXPE + QUAL + VAL 
LOY  ~ IMAG + SAT

# Composite model
IMAG <~ imag1 + imag2 + imag3
EXPE <~ expe1 + expe2 + expe3 
QUAL <~ qual1 + qual2 + qual3 + qual4 + qual5
VAL  <~ val1  + val2  + val3

# Reflective measurement model
SAT  =~ sat1  + sat2  + sat3  + sat4
LOY  =~ loy1  + loy2  + loy3  + loy4
"

The estimation is conducted using the csem() function.

# Estimate using defaults
res <- csem(.data = satisfaction, .model = model)
res

This is equal to:

csem(
   .data                        = satisfaction,
   .model                       = model,
   .approach_cor_robust         = "none",
   .approach_nl                 = "sequential",
   .approach_paths              = "OLS",
   .approach_weights            = "PLS-PM",
   .conv_criterion              = "diff_absolute",
   .disattenuate                = TRUE,
   .dominant_indicators         = NULL,
   .estimate_structural         = TRUE,
   .id                          = NULL,
   .iter_max                    = 100,
   .normality                   = FALSE,
   .PLS_approach_cf             = "dist_squared_euclid",
   .PLS_ignore_structural_model = FALSE,
   .PLS_modes                   = NULL,
   .PLS_weight_scheme_inner     = "path",
   .reliabilities               = NULL,
   .starting_values             = NULL,
   .tolerance                   = 1e-05,
   .resample_method             = "none", 
   .resample_method2            = "none",
   .R                           = 499,
   .R2                          = 199,
   .handle_inadmissibles        = "drop",
   .user_funs                   = NULL,
   .eval_plan                   = "sequential",
   .seed                        = NULL,
   .sign_change_option          = "none"
    )

The result is always a named list of class cSEMResults.

To access list elements use $:

res$Estimates$Loading_estimates 
res$Information$Model

A useful tool to examine a list is the listviewer package. If you are new to cSEM this might be a good way to familiarize yourself with the structure of a cSEMResults object.

listviewer::jsonedit(res, mode = "view") # requires the listviewer package.

Apply post-estimation functions:

rnorm(1) # necessary to initialize a .Random.seed object

## Get a summary
summarize(res) 

## Verify admissibility of the results
verify(res) 

## Test overall model fit
testOMF(res)

## Assess the model
assess(res)

## Predict indicator scores of endogenous constructs
predict(res)

Resampling and Inference

By default no inferential statistics are calculated since most composite-based estimators have no closed-form expressions for standard errors. Resampling is used instead. cSEM mostly relies on the bootstrap procedure (although jackknife is implemented as well) to estimate standard errors, test statistics, and critical quantiles.

cSEM offers two ways for resampling:

Setting .resample_method in csem() to "jackknife" or "bootstrap" and subsequently using post-estimation functions summarize() or infer().
The same result is achieved by passing a cSEMResults object to resamplecSEMResults() and subsequently using post-estimation functions summarize() or infer().

# Setting `.resample_method`
b1 <- csem(.data = satisfaction, .model = model, .resample_method = "bootstrap")
# Using resamplecSEMResults()
b2 <- resamplecSEMResults(res)

b1 <- csem(.data = satisfaction, .model = model, .resample_method = "bootstrap")

The summarize() function reports the inferential statistics:

summarize(b1)

Several bootstrap-based confidence intervals are implemented, see ?infer():

infer(b1, .quantity = c("CI_standard_z", "CI_percentile")) # no print method yet

Both bootstrap and jackknife resampling support platform-independent multiprocessing as well as setting random seeds via the future framework. For multiprocessing simply set .eval_plan = "multisession" in which case the maximum number of available cores is used if not on Windows. On Windows as many separate R instances are opened in the background as there are cores available instead. Note that this naturally has some overhead so for a small number of resamples multiprocessing will not always be faster compared to sequential (single core) processing (the default). Seeds are set via the .seed argument.

b <- csem(
  .data            = satisfaction,
  .model           = model, 
  .resample_method = "bootstrap",
  .R               = 999,
  .seed            = 98234,
  .eval_plan       = "multisession")