collapse = TRUE,
  comment = "#>"

semEff automates calculation of effects for structural equation models (SEM) which use local estimation ('piecewise'; see @lefcheckPiecewiseSEMPiecewiseStructural2016). Here's a short example using an SEM testing fire impacts on plant species richness (@graceStructuralEquationModel2006; data provided with the piecewiseSEM package).

# install.packages(c("semEff", "piecewiseSEM"))
data(keeley, package = "piecewiseSEM")

Let's have a look at the data. Observations represent a range of vegetation parameters measured in 90 post-fire study plots after wildfires in southern California in 1993:


An SEM fit to these data in Grace and Keeley (2006) tests the influence of landscape vs. local factors in the response of plants to fire, or more specifically, the direct vs. indirect effects of landscape location (distance from coast) on plant species richness via other biotic and abiotic mediators (see the paper for further details). We can reproduce this SEM (roughly) in a piecewise fashion by fitting a separate linear model for each observed response variable:

keeley.sem <- list(
  lm(age ~ distance, data = keeley),
  lm(hetero ~ distance, data = keeley),
  lm(abiotic ~ distance, data = keeley),
  lm(firesev ~ age, data = keeley),
  lm(cover ~ firesev, data = keeley),
  lm(rich ~ distance + hetero + abiotic + cover, data = keeley)

And visualise it using piecewiseSEM:::plot.psem() (wrapper for DiagrammeR::render_graph()), which helps to clarify the directions of causal pathways in the system:

  node_attrs = data.frame(shape = "rectangle", color = "black", fillcolor = "grey"),
  layout = "tree"

Now, to calculate effects, we first bootstrap and save the standardised direct effects:

  keeley.sem.boot <- bootEff(keeley.sem, R = 100, seed = 13, parallel = "no")

Note that here we use only 100 reps to save time, where typically we should use a lot more for greater accuracy of confidence intervals (e.g. 1,000---10,000+). Using the bootstrapped estimates we can now calculate direct, indirect, and total effects:

(keeley.sem.eff <- semEff(keeley.sem.boot))

Let's examine effects for the final response variable, plant species richness, which has four direct and five indirect paths leading to it:

summary(keeley.sem.eff, response = "rich")

We can see that the standardised total effect of distance from coast on plant species richness is moderately high at 0.453 (CIs: 0.373, 0.526), and is split almost equally between direct (0.233) and indirect (0.220) effects (involving five mediators). The mediator effects listed here are the sum of all the indirect paths from all predictors which involve each mediator, and can be useful as a rough indicator of the overall importance and net direction of effect of each. hetero, abiotic, and cover appear to be relatively important, with cover having the strongest influence (-0.155). However, this considers all predictors in the system collectively --- meaning that mediators are themselves counted as predictors. To focus only on the exogenous variable distance, let's recalculate:

  semEff(keeley.sem.boot, predictor = "distance"),
  response = "rich"

Now we can see that hetero and abiotic are the important mediators for distance, while the relative influence of cover is much lower --- weakened due to being part of a single longer chain of effect (involving age and firesev). The total indirect effect of 0.220 can thus be broken down into three individual paths:

  1. distance -> age -> firesev -> cover -> rich (0.015)

  2. distance -> hetero -> rich (0.104)

  3. distance -> abiotic -> rich (0.101)

This is only one potential way of analysing paths in this SEM. Depending on the size of the system and the particular research questions, any number of predictors, mediators and/or response variables can be investigated separately or collectively for direct vs. indirect effects. If there are multiple hypotheses (competing or complementary), these can all be tested in a single SEM by comparing the relative importance of different variables and pathways of effect.

NOTE: All effects presented here are standardised unique effects (i.e. adjusted for multicollinearity, a.k.a. semipartial correlations), which is the only way to fully partition effects in the system. These will usually be a bit smaller than the unadjusted standardised coefficients, since most variables are correlated to some degree. If there are large differences between the two, consideration should be given to the impact (and relevance) of multicollinearity in the system. In this particular case, it's minimal, with standard errors of coefficients less than twice as large as they should be (checked using RVIF() for the species richness model):


To use original standardised coefficients instead, specify unique.eff = FALSE, which can be passed to stdEff(), bootEff(), and/or semEff(). These can be preferred if prediction is the primary goal rather than inference, for comparison with the unique effects or with other standardised coefficients, or for other reasons.


Try the semEff package in your browser

Any scripts or data that you put into this service are public.

semEff documentation built on Oct. 12, 2021, 5:06 p.m.