knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 6, fig.align = "center", fig.path = "" )
This article illustrates how to use model_set() and other functions from the package modelbpp to:

- Fit a set of neighboring models, each having one more or one fewer degree of freedom than the original fitted model.
- Compute the BIC posterior probability (BPP) of each model [@wu_simple_2020].
- Use the BPPs to assess the extent to which each model is supported by the data, compared to all other models under consideration.
The workflow is as follows (a compact code sketch appears after this list):

1. Fit an SEM model, the original model, as usual in lavaan.
2. Call model_set() on the output from Step 1. It will automatically do the following:
   - Enumerate the neighboring models of the original model.
   - Fit all the models and compute their BIC posterior probabilities (BPPs).
3. Examine the results by:
   - printing the output of model_set(), or
   - generating a graph using model_graph().
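The following is a compact sketch of these three steps, using the model and dataset introduced in the sections below. It only previews what the rest of this article does step by step:

library(lavaan)
library(modelbpp)
# Step 1: Fit the original model in lavaan
mod <- "
m1 ~ x
m2 ~ m1
y ~ m2 + x
"
fit <- sem(mod, dat_serial_4_weak)
# Step 2: Enumerate and fit the neighboring models, and compute their BPPs
out <- model_set(fit)
# Step 3: Examine the results
out                     # print the BPPs
plot(model_graph(out))  # visualize the models and their BPPs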
This is a sample dataset, dat_serial_4_weak, with four variables:
library(modelbpp)
head(dat_serial_4_weak)
Fit the original model, a serial mediation model with one direct path from x to y:
library(lavaan)
mod1 <- "
m1 ~ x
m2 ~ m1
y ~ m2 + x
"
fit1 <- sem(mod1, dat_serial_4_weak)
This is the summary:
summary(fit1, fit.measures = TRUE)
tmp <- fitMeasures(fit1)
fit1_cfi <- unname(tmp["cfi"])
fit1_rmsea <- unname(tmp["rmsea"])
The fit is acceptable, though the RMSEA is marginal (CFI = `r formatC(fit1_cfi, 3, format = "f")`, RMSEA = `r formatC(fit1_rmsea, 3, format = "f")`).
model_set()
Use model_set() to find the neighboring models that differ from the target model by one model degree of freedom, fit them, and compute their BPPs:
out1 <- model_set(fit1)
To examine the results, just print the output:
out1
out1_bpp <- out1$bpp
out1_bpp_2 <- sort(out1_bpp, decreasing = TRUE)[2]
The total number of models examined, including the original model, is `r length(out1$models)`. (Note: The total number of models was 9 in a previous version. Please refer to the Note in the printout for the changes.)
The BIC posterior probabilities (BPPs) suggest that the original model is indeed the most probable model based on BPP. However, the model with the direct path dropped, drop: y~x, has only a slightly lower BPP (`r formatC(out1_bpp_2, 3, format = "f")`). This suggests that, with equal prior probabilities [@wu_simple_2020], the models with and without the direct path receive similar support from the data based on BPP.
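As an aside, the following is a minimal sketch of how BPPs relate to BIC values when all models have equal prior probabilities. It is an illustration only: the BIC values below are hypothetical, and this is not the internal code of model_set().

# Hypothetical BIC values of three competing models (illustration only)
bics <- c(original = 210.3, drop_y_x = 210.9, add_m2_x = 215.1)
# Equal prior probabilities for all models
priors <- rep(1 / length(bics), length(bics))
# BPP is proportional to exp(-BIC / 2) times the prior probability;
# subtracting min(bics) avoids underflow without changing the result
w <- exp(-(bics - min(bics)) / 2) * priors
bpp <- w / sum(w)
round(bpp, 3)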
Alternatively, we can use model_graph() to visualize the BPPs and the relations among the models:
graph1 <- model_graph(out1)
plot(graph1)
Each node (circle) represents one model. The larger the BPP, the larger the node.
An arrow points from a simpler model (a model with larger model df) to a more complicated model (a model with smaller model df). If two models are connected by an arrow, then one can be formed from the other by adding or removing one free parameter (e.g., adding or removing one path).
In real studies, not all models are equally probable before the data are collected (i.e., not all models have equal prior probabilities). A researcher fits the original model because its prior probability is higher than those of other models, at least of other neighboring models (otherwise, it would not be worth collecting data to assess the original model), but the prior probability is not so high as to eliminate the need for data to see how much the model is supported.
Suppose we decide that the prior probability of the original model is .50: probable, but data are still needed to decide whether it is empirically supported.
This can be done by setting prior_sem_out to the desired prior probability when calling model_set():
out1_prior <- model_set(fit1, prior_sem_out = .50)
The prior probabilities of all other models are equal. Therefore, with nine models and the prior of the target model being .50, the prior probability of each of the other eight models is (1 - .50) / 8, or .0625.
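The arithmetic for this example:

# Prior probability of the original (target) model
prior_target <- .50
# The remaining prior mass is split equally among the other eight models
n_models <- 9
(1 - prior_target) / (n_models - 1)
#> [1] 0.0625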
This is the printout:
out1_prior
If the prior of the target is set to .50, then, taking into account both the prior probabilities and the data, the target model is strongly supported by the data.
This is the output of model_graph():
graph1_prior <- model_graph(out1_prior)
plot(graph1_prior)
If desired, we can enumerate models "farther away" from the target model. For example, we can set the maximum difference in model df to 2, to include models having two more or two fewer df than the original model:
out1_df2 <- model_set(fit1, df_change_add = 2, df_change_drop = 2)
This is the printout. By default, when there are more than 20 models, only the top 20 models on BPP will be printed:
out1_df2
The number of models examined, including the original model, is `r length(out1_df2$models)`.
This is the output of model_graph():
graph1_df2 <- model_graph(out1_df2, node_label_size = .75)
plot(graph1_df2)
Note: Due to the number of nodes, node_label_size is used to reduce the size of the labels.
When calling model_set(), users can specify parameters that must not be added (must_not_add) or parameters that must not be dropped (must_not_drop).
For example, suppose it is well established that the path m1~x exists and should not be dropped. We can exclude it from being dropped when calling model_set():
out1_no_m1_x <- model_set(fit1, must_not_drop = "m1~x")
This is the output:
out1_no_m1_x
The number of models is reduced to `r length(out1_no_m1_x$models)`.
This is the plot:
graph1_no_m1_x <- model_graph(out1_no_m1_x)
plot(graph1_no_m1_x)
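Similarly, must_not_add can be used to exclude parameters from the list of parameters that can be added. The following is a sketch for a hypothetical scenario (not part of the original example), assuming must_not_add accepts lavaan-style parameter strings like must_not_drop, and that we do not want the direct path from x to m2 to be considered:

# Exclude m2 ~ x from the paths that can be added (illustration only)
out1_no_add_m2_x <- model_set(fit1, must_not_add = "m2~x")
out1_no_add_m2_x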
If the original model has equality constraints, they will also be included in the search for neighboring models by default. That is, releasing one equality constraint between two parameters is treated as a change of one degree of freedom.
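For illustration (this variant is not part of the original example), below is a version of the original model with the two paths along the mediation chain constrained to be equal using a lavaan label. By default, releasing such a constraint is also considered when model_set() enumerates the neighboring models:

# A variant of the original model with two paths constrained to be equal,
# using the lavaan label "b" (illustration only)
mod1_eq <- "
m1 ~ x
m2 ~ b * m1
y ~ b * m2 + x
"
fit1_eq <- sem(mod1_eq, dat_serial_4_weak)
# Equality constraints are included in the search by default
out1_eq <- model_set(fit1_eq)
out1_eq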
Users can examine the impact of the prior probability of the original model without refitting the models, by passing the output of model_set() to the model_set_out argument:
out1_new_prior <- model_set(model_set_out = out1, prior_sem_out = .50)
The results are identical to calling model_set() with the original lavaan output as the input:
out1_new_prior
When a model has many free parameters, the number of neighboring models can be large, and it can take a long time to fit all of them. Users can enable parallel processing by setting parallel to TRUE when calling model_set().
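For example:

# Fit the neighboring models in parallel (may take some time to set up)
out1_parallel <- model_set(fit1, parallel = TRUE)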
Please refer to the help page of model_set() for the options available.
For further information on other functions, please refer to their help pages.