knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
At the time of writing,
semhelpinghands
has these
groups of functions: manipulating
parameter estimates tables, comparing
results from different methods,
bootstrapping, and ... others.
This article will only introduce very briefly the groups. Some of them will be introduced in details in forthcoming article dedicated to them.
Let's load the package first, and
also load lavaan
.
library(semhelpinghands) library(lavaan)
In using lavaan
, I prefer reading the
output of parameterEstimates()
, which
is compact and clear to me. I sometimes
would like to organize the rows and
columns in ways meaningful to a
particular research questions. This
can certainly be done using base R
or dplyr
. However, I am lazy and want
to be able to do things using just one
or two functions, with just one or two
arguments.
This is a sample dataset for illustration,
dvs_ivs
,
with 3 predictors (x1
, x2
, and x3
),
3 outcome variables (y1
y2
, and y3
),
and a group variable (gp
).
First a single sample model:
data(dvs_ivs) mod <- " y1 ~ x1 + x2 + x3 y2 ~ x1 + x3 y3 ~ y2 + x2 " fit <- sem(model = mod, data = dvs_ivs, fixed.x = FALSE)
The parameter estimates tables:
est <- parameterEstimates(fit) est
A two-sample model is also fitted to the dataset:
fit_gp <- sem(model = mod, data = dvs_ivs, group = "gp", fixed.x = FALSE)
This is the parameter estimates table:
est_gp <- parameterEstimates(fit_gp) est_gp
So, these are what semhelpinghands
has for now:
add_sig()
Despite the controversy over null
hypothesis significance testing,
we still need to report them, for now.
They are there, but I have to decode
them mentally. Here comes add_sig()
:
add_sig(est)
If bootstrapping was used, it also supports using the confidence intervals, which may yield results different from the p-values when bootstrapping confidence intervals are used (but same results in this example):
add_sig(est, use = c("pvalue", "ci"))
It also works on the standardized solution:
std <- standardizedSolution(fit) add_sig(std)
Note: But be careful about the
interpretation of the p-values
in the standardized solution,
which are based on the delta method.
See vignette("standardizedSolution_boot_ci")
.
See the help page of add_sig()
for other options.
filter_by()
Its purpose is very simple, selecting
rows by commonly used columns: operators (op
),
"dependent variables" (lhs
), and
"independent variables" (rhs
).
filter_by(est, op = "~")
It also supports filtering by group using group labels:
filter_by(est_gp, op = "~", group = "gp1", fit = fit_gp)
See the help page of filter_by()
for other options.
group_by_dvs()
and group_by_ivs()
Sometimes I want the conventional
iv-column-dv-row or dv-column-iv-row
format. That's what group_by_dvs()
and group_by_ivs()
are for:
group_by_dvs(est) group_by_ivs(est)
They also supports extracting another column:
group_by_dvs(est, col_name = "pvalue") group_by_ivs(est, col_name = "pvalue")
See the help page of group_by_dvs()
and group_by_ivs()
for other options.
group_by_groups()
In multiple sample models, one common
task is comparing results across groups.
I wrote group_by_groups()
for this
task, to compare results side-by-side:
group_by_groups(est_gp)
It also supports extracting several columns:
group_by_groups(est_gp, col_names = c("est", "pvalue"))
If the fit object is used, it can print group labels:
group_by_groups(fit_gp, col_names = c("est", "pvalue"))
See the help page of group_by_groups()
for other options.
group_by_models()
This is inspired by the proposal
Rönkkö proposed in a GitHub
issue
for semTools
. I want something
simple for a quick overview and so
I wrote group_by_models()
.
Suppose this is the other model fitted:
mod2 <- " y1 ~ x1 + x2 + x3 y2 ~ x1 + x2 y3 ~ y2 + x1 " fit2 <- sem(model = mod2, data = dvs_ivs, fixed.x = FALSE) est2 <- parameterEstimates(fit2) est2
These two models have no nested
relationships. To compare the estimates,
group_by_models()
can be used:
group_by_models(list(Model1 = est, Model2 = est2))
It can also compare several columns:
group_by_models(list(Model1 = est, Model2 = est2), col_names = c("est", "pvalue"))
See the help page of group_by_models()
for other options.
sort_by()
The function sort_by()
can be used
to sort rows using (a) common fields
such as "op"
, "lhs"
, and "rhs"
,
(b) operators such as "~"
and "~~"
.
It can be used on the output of
some other functions that manipulate
a parameter estimates table.
out <- group_by_groups(est_gp, col_names = c("est", "pvalue")) out <- filter_by(out, op = c("~", "~~")) sort_by(out, by = c("op", "rhs"))
Some functions, such as group_by_models()
,
will automatically
call sort_by()
to sort the results.
The default order should be acceptable
in most cases, but can also be customized.
See the help page of sort_by()
on customizing the order.
Though not officially supported, piping using
|>
can be used with most of the
functions that manipulate
parameter estimates tables. For example:
est_gp |> add_sig() |> group_by_groups(col_names = c("est", "pvalue", "sig"), group_first = FALSE) |> filter_by(op = c("~"))
Though I believe the choice of the
estimation method should be justified,
suppose we want to assess the sensitivity
of the parameter estimate results to
the methods used, compare_estimators()
can be used as a quick way to compare
results from different methods.
out <- compare_estimators(fit, estimator = c("ML", "GLS", "MLR")) group_by_models(out, col_names = c("se", "pvalue"))
It simply refits the models for each
estimator and returns the results.
They can then be treated as different
"models" and processed by group_by_models()
.
se_ratios()
is a wrapper of group_by_models()
used to compare the standard errors
by different estimators in the output
of compare_estimators()
:
se_ratios(out, reference = "ML")
See the help page of compare_estimators()
for other options.
One issue with the standardized solution
is the confidence intervals. They are
based on the delta method even if
se = "boot"
is used. For indirect
effects, for which bootstrap confidence
intervals are commonly used, the
confidence intervals for them in
the standardized solution are not
what usually reported other tools
for mediation. There are some
other powerful tools on the Internet and
the [Google Group for lavaan], see
this thread,
to address this problem.
I wrote standardizedSolution_boot_ci()
not to replace them (it obviously can't),
but to address a very specific
case I usually encounter myself:
Generating the bootstrap confidence
intervals for standardized estimates
based on the bootstrap estimates already
generated by se = "boot"
.
Please see vignette("standardizedSolution_boot_ci")
for an illustration on standardizedSolution_boot_ci()
.
Another issue is examine bootstrap
estimates, such as the distribution
of the estimates. The function
plot_boot()
and related functions
can be used to compute bootstrap estimates
and plot them. The bootstrap estimates
of free parameters (stored by lavaan
),
user-defined parameters (computed by
store_boot_def()
), and the
standardized solution (computed by
store_boot_est_std()
) can be plotted.
Please see the article
on plot_boot()
on how to use this
function.
lavaan::lavaan()
and its wrappers suc has lavaan::sem()
and lavaan::cfa()
allow users to set several options
using an estimator
: ML
, GLS
, WLSMV
, and others.
However, it is not easy to remember
what the options set for an estimator.
Instead of finding them from the output
of summary()
, this function shows some
of them in one table for a quick overview.
This is an example:
data(dvs_ivs) mod <- " y1 ~ x1 + x2 + x3 y2 ~ x1 + x3 y3 ~ y2 + x2 " fit_default <- sem(model = mod, data = dvs_ivs) show_more_options(fit_default) fit_MLR <- sem(model = mod, data = dvs_ivs, estimator = "MLR") show_more_options(fit_MLR) fit_MLR_fiml <- sem(model = mod, data = dvs_ivs, estimator = "MLR", missing = "fiml") show_more_options(fit_MLR_fiml)
In structural equation modeling,
closed-form solution is rare and
optimization (minimization) is used
to find the/a solution. Out of curiosity
and teaching, I wrote a function
to capture the minimization history
so I can examine it or even plot
the process. The function,
record_history()
, is still in early
development but should work for now
for common simple scenarios.
Please refer the help page of
record_history()
to learn more about it.
In lavaan
, I rarely need to manually add
covariances between exogenous variables
(defined in a loose sense: variables appear
on the right-hand side but not on the left-hand
side of ~
). However, I came into a situation
in which lavaan
will not do this
(for good reasons). For example,
when a covariance between a residual
term and an exogenous variable is set
to free. I wrote two simple functions,
add_exo_cov()
and auto_exo_cov()
,
for this purpose. Please refer to
their help pages for further information.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.