knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) run_mplus <- FALSE
library(tidySEM) library(lavaan) library(dplyr)
tidySEM
offers a user-friendly, tidy workflow for generating syntax for SEM models. The workflow is top-down, meaning that syntax is generated based on conceptual model elements. In many cases, the generated syntax will suffice - but it is always customizable. The workflow also tries to intelligently guess which variables go together, but these defaults can be overridden.
The workflow underlying syntax generation in tidySEM
is as follows:
data
object short, informative names, that are easily machine readabletidy_sem
object by running model <- tidy_sem(data)
<!-- * The data_dictionary
called dict
can be used to generate keys(dict)
, for use with functions (reliability, factor analysis) from the psych
librarydata_dictionary
called dict
can be used to generate syntax for structural equation models in lavaan
or 'Mplus' (through MplusAutomation
)-->measurement(model)
dictionary
, data
, and syntax
elements in the tidy_sem
object by calling dictionary(model)
, get_data(model)
, or syntax(model)
dictionary
, data
, and syntax
elements in the tidy_sem
object dictionary(model) <- ...
, get_data(model) <- ...
, and syntax(model) <- ...
tidy_sem
object to lavaan
syntax using as_lavaan(model)
and using that as input for the lavaan
functions sem
, lavaan
, or cfa
tidy_sem
object to OpenMx
using as_ram(model)
, and using that as input for mxRun
or `run_mx``tidy_sem
object to Mplus
using as_mplus(model)
, and using that as input for MplusAutomation::mplusObject()
estimate_lavaan(model)
, estimate_mx(model)
, or estimate_mplus(model)
All elements of the tidy_sem
object are "tidy" data, i.e., tabular data.frames
, and can be modified using the familiar suite of functions in the 'tidyverse'. Thus, the data, dictionary, and syntax are all represented as data.frame
s.
As an example, let's make a graph for a classic lavaan
tutorial example for CFA. First, we check the data names:
df <- HolzingerSwineford1939 names(df)
These names are not informative, as the items named x..
are indicators of three different latent variables. We will rename them accordingly:
names(df)[grepl("^x", names(df))] <- c("vis_1", "vis_2", "vis_3", "tex_1", "tex_2", "tex_3", "spe_1", "spe_2", "spe_3")
In general, it is good practice to name variables using the following information:
tidySEM
expects to be separated from the remainder of the variable name by a splitting character (e.g., scale_item
)Roughly speaking, elements of the variable name should be ordered from "slow-changing" to "fast-changing"; i.e.; there are only a few scales, with possibly several measurement occasions or respondents, and many items.
A dictionary indicates which variables in the data belong to, for example, the same scale. When the data have informative names, it is possible to construct a data dictionary automatically:
model <- tidy_sem(df) model
We can automatically add basic syntax to the sem_syntax
object, by passing it to a syntax-generating function like measurement()
, which adds a measurement model for any scales in the object:
model |> measurement() -> model model
model <- measurement(model) model
The resulting model can be evaluated as 'lavaan' syntax, 'OpenMx' syntax, or 'Mplus' syntax, using the as_lavaan
, as_ram
, and as_mplus
functions. For example, using lavaan:
model |> estimate_lavaan()
estimate_lavaan(model)
The same model can be estimated in 'Mplus' through the R-package MplusAutomation
. This requires 'Mplus' to be installed.
estimate_mx(model) -> res_mx
model |> estimate_mx()
The same model can be estimated in 'Mplus' through the R-package MplusAutomation
. This requires 'Mplus' to be installed.
library(MplusAutomation) model |> estimate_mplus() -> res #dput(capture.output(summary(res)))
library(MplusAutomation) model |> estimate_mplus()
# Display the results cat(c("Estimated using ML ", "Number of obs: 301, number of (free) parameters: 30 ", "", "Model: Chi2(df = 24) = 85.306, p = 0 ", "Baseline model: Chi2(df = 36) = 918.852, p = 0 ", "", "Fit Indices: ", "", "CFI = 0.931, TLI = 0.896, SRMR = 0.06 ", "RMSEA = 0.092, 90% CI [0.071, 0.114], p < .05 = 0.001 ", "AIC = 7535.49, BIC = 7646.703 " ), sep = "\n")
The dictionary and syntax can be examined using dictionary(model)
and syntax(model)
:
dictionary(model)
syntax(model)
At this stage, we may want to modify the basic syntax slightly. The functions dictionary(model) <- ...
and syntax(model) <- ...
can be used to modify the dictionary and syntax:
dictionary(model) |> mutate(label = ifelse(label == "vis", "Visual", label))
tmp <- dictionary(model) mutate(tmp, label = ifelse(label == "vis", "Visual", label))
For example, imagine we want to change the model, so that all items of the "spe" subscale load on the "tex" latent variable. We would first replace the latent variable "spe" with "tex", and secondly remove all mention of the "spe" latent variable:
syntax(model) |> mutate(lhs = ifelse(lhs == "spe" & op == "=~", "tex", lhs)) |> filter(!(lhs == "spe" | rhs == "spe")) -> syntax(model)
tmp <- syntax(model) tmp <- mutate(tmp, lhs = ifelse(lhs == "spe" & op == "=~", "tex", lhs)) tmp <- filter(tmp, !(lhs == "spe" | rhs == "spe")) syntax(model) <- tmp
The modified model could then be run:
estimate_lavaan(model)
In addition to the way of editing the data.frame
with model syntax described in Step 6, it is also possible to add (or modify) paths by adding lavaan
syntax. For example, imagine that - instead of having "vis" and "tex" correlate, we want to add a regression path between them:
model |> add_paths("vis ~ tex") |> estimate_lavaan() |> summary(estimates = TRUE)
tmp <- add_paths(model, "vis ~ tex") tmp <- estimate_lavaan(tmp) summary(tmp, estimates = TRUE)
This function accepts both quoted (character) and unquoted arguments. So, for example, if we want to add a cross-loading from "spe_1" on "vis", in addition to the regression path before, we could use the following code:
model |> add_paths("vis ~ tex", vis =~ spe_1) |> estimate_lavaan()
tmp <- add_paths(model, "vis ~ tex", vis =~ spe_1) estimate_lavaan(tmp)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.