knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4, fig.align = "center" )
This document demonstrates how to use std_selected() from the stdmod package to compute the correct standardized solution of moderated regression. More about this package can be found in vignette("stdmod", package = "stdmod") or at https://sfcheung.github.io/stdmod/.
library(stdmod) # For computing the standardized moderation effect conveniently
data(sleep_emo_con)
head(sleep_emo_con, 3)
This data set has 500 cases. The variables are sleep duration, age, gender, and the scores from two personality scales of the IPIP Big Five markers: emotional stability and conscientiousness. Please refer to (citation to be added) for details of the data set.
The names of some variables are shortened for readability:
colnames(sleep_emo_con)[3:4] <- c("cons", "emot")
head(sleep_emo_con, 3)
Suppose we are interested in predicting sleep duration from emotional stability, after controlling for gender and age. However, we suspect that the effect of emotional stability, if any, may be moderated by conscientiousness. Therefore, we conduct a moderated regression as follows:
lm_out <- lm(sleep_duration ~ age + gender + emot * cons, data = sleep_emo_con)
summary(lm_out)
plotmod(lm_out,
        x = "emot",
        w = "cons",
        x_label = "Emotional Stability",
        w_label = "Conscientiousness",
        y_label = "Sleep Duration")
The results show that conscientiousness significantly moderates the effect of emotional stability on sleep duration.
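As a small aside on the model formula used above, the `*` operator is expanded by R into both main effects plus their product term. The sketch below only inspects the formula; it does not need the data set or stdmod.

```r
# Inspect how R expands `emot * cons` in the model formula.
# terms() works on the formula alone; no data are required.
f <- sleep_duration ~ age + gender + emot * cons
term_labels <- attr(terms(f), "term.labels")
print(term_labels)
```

The product term is labelled emot:cons, which is the name used to extract the moderation coefficient later in this document.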
To get the correct standardized solution of the moderated regression, with the product term formed after standardization, we can use std_selected().
The first argument is the regression output from lm(). The argument to_center specifies variables to be mean centered. The argument to_scale specifies variables to be rescaled by their standard deviations after centering.
In stdmod 0.2.6.3, the argument to_standardize was introduced as a shortcut. Listing a variable in to_standardize is equivalent to listing it in both to_center and to_scale.
If we want to standardize or mean center all variables, we can use ~ . as a shortcut. Note that std_selected() will automatically skip categorical variables (i.e., factors or string variables in the regression model of lm()).
lm_stdall <- std_selected(lm_out, to_standardize = ~ .)
Before 0.2.6.3, to standardize all variables except for categorical variables, we needed to specify both to_center = ~ . and to_scale = ~ . in the call. Since 0.2.6.3, we can just use to_standardize = ~ ., as shown above.
If to_standardize = ~ . does not work, just use to_center and to_scale as shown below:
lm_stdall <- std_selected(lm_out, to_center = ~ ., to_scale = ~ .)
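To make concrete what mean centering and scaling before forming the product term amounts to, here is a minimal base R sketch on simulated data. The data frame dat, its variables x, w, g, and y, and the generating coefficients are all hypothetical, and the code illustrates the idea only; it is not a description of how std_selected() is implemented.

```r
# Hypothetical simulated data: two numeric predictors, one factor, one outcome
set.seed(1)
n <- 200
dat <- data.frame(
  x = rnorm(n, mean = 5, sd = 2),
  w = rnorm(n, mean = 10, sd = 3),
  g = factor(sample(c("a", "b"), n, replace = TRUE))
)
dat$y <- 1 + 0.4 * dat$x + 0.3 * dat$w + 0.1 * dat$x * dat$w +
  0.5 * (dat$g == "b") + rnorm(n)

# Standardize numeric columns only; the factor g is left untouched
is_num <- vapply(dat, is.numeric, logical(1))
dat_std <- dat
dat_std[is_num] <- lapply(dat[is_num], function(v) (v - mean(v)) / sd(v))

# lm() forms the product term from the already standardized x and w
fit_std <- lm(y ~ g + x * w, data = dat_std)
coef(fit_std)
```

The key design point is the order of operations: the variables are standardized first and the product is computed afterwards, so the product term itself is never standardized.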
A summary of the results of std_selected() can be generated by summary():
summary(lm_stdall)
The coefficient of the product term in this solution, obtained with round(coef(lm_stdall)["emot:cons"], 5), can be interpreted as the change in the standardized effect of emotional stability for each one standard deviation increase of conscientiousness. Naturally, this can be called the standardized moderation effect of conscientiousness (Cheung, Cheung, Lau, Hui, & Vong, 2022).
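The interpretation above can be checked numerically. When the predictor, the moderator, and the outcome are standardized before the product is formed, the product-term coefficient equals the unstandardized one multiplied by sd(x) * sd(w) / sd(y). The sketch below uses hypothetical simulated variables x, w, and y in base R and does not call stdmod.

```r
# Hypothetical simulated data with a true interaction
set.seed(2)
n <- 300
x <- rnorm(n, mean = 5, sd = 2)
w <- rnorm(n, mean = 10, sd = 3)
y <- 1 + 0.4 * x + 0.3 * w + 0.1 * x * w + rnorm(n)

# Unstandardized moderated regression
fit_raw <- lm(y ~ x * w)

# Standardize first, then let lm() form the product term
z <- function(v) (v - mean(v)) / sd(v)
zx <- z(x)
zw <- z(w)
zy <- z(y)
fit_std <- lm(zy ~ zx * zw)

b3_raw <- unname(coef(fit_raw)["x:w"])
b3_std <- unname(coef(fit_std)["zx:zw"])

# Rescaling the raw coefficient reproduces the standardized moderation effect
b3_rescaled <- b3_raw * sd(x) * sd(w) / sd(y)
c(b3_std = b3_std, b3_rescaled = b3_rescaled)
```

The two values agree because standardizing x, w, and y is a linear reparameterization of the same model.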
The output of std_selected() can be passed to other functions that accept the output of lm(). This package also has a simple function, plotmod(), for generating a typical plot of the moderation effect:
plotmod(lm_stdall, x = "emot", w = "cons", x_label = "Emotional Stability", w_label = "Conscientiousness", y_label = "Sleep Duration")
The function plotmod() also prints the conditional effects of the predictor (focal variable), emotional stability in this example.
For comparison, these are the results of standardizing all variables, including the product term and the categorical variable.
library(lm.beta) # For generating the typical standardized solution
packageVersion("lm.beta")
lm_beta <- lm.beta(lm_out)
summary(lm_beta)
The coefficient of the standardized product term, obtained with round(coef(lm_beta)["emot:cons"], 5), cannot be interpreted as the change in the standardized effect of emotional stability for each one standard deviation increase of conscientiousness. The product term itself has been standardized, so it can no longer be interpreted as the product of two variables in the model.
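The difference between the two solutions can be illustrated in base R. Standardizing the product term as if it were an ordinary predictor rescales the raw coefficient by sd(x * w) / sd(y), not by sd(x) * sd(w) / sd(y). The sketch below uses hypothetical simulated variables and standardizes every column by hand; it mirrors the idea of standardizing everything and makes no claim about the internals of lm.beta.

```r
# Hypothetical simulated data with nonzero predictor means
set.seed(3)
n <- 300
x <- rnorm(n, mean = 5, sd = 2)
w <- rnorm(n, mean = 10, sd = 3)
y <- 1 + 0.4 * x + 0.3 * w + 0.1 * x * w + rnorm(n)
xw <- x * w

b3_raw <- unname(coef(lm(y ~ x + w + xw))["xw"])

# Standardize every column, including the product term itself
z <- function(v) (v - mean(v)) / sd(v)
zy <- z(y)
zx <- z(x)
zw <- z(w)
zxw <- z(xw)
beta_xw <- unname(coef(lm(zy ~ zx + zw + zxw))["zxw"])

# Coefficient when the product is formed after standardization
std_mod <- b3_raw * sd(x) * sd(w) / sd(y)

c(beta_xw = beta_xw,
  raw_times_sd_ratio = b3_raw * sd(xw) / sd(y),
  std_mod = std_mod)
```

Because sd(x * w) depends on the means of x and w, the standardized product-term coefficient changes if either variable is shifted by a constant, whereas the standardized moderation effect does not.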
It has been shown (e.g., Yuan & Chan, 2011) that the standard errors of standardized regression coefficients computed just by standardizing the variables are biased, and consequently the confidence intervals are also invalid. The function std_selected_boot() is a wrapper of std_selected() that also forms confidence intervals for the regression coefficients when standardization is conducted, using nonparametric bootstrapping as suggested by Cheung, Cheung, Lau, Hui, and Vong (2022).
We use the same example above that standardizes all variables except for categorical variables to illustrate this function. The argument nboot specifies the number of nonparametric bootstrap samples. The level of confidence is set by conf. The default is .95, denoting 95% confidence intervals. If this is the desired level, this argument can be omitted.
if (file.exists("eg2_lm_xwy_std_ci.rds")) {
    lm_xwy_std_ci <- readRDS("eg2_lm_xwy_std_ci.rds")
  } else {
    set.seed(649017)
    lm_xwy_std_ci <- std_selected_boot(lm_out,
                                       to_center = ~ .,
                                       to_scale = ~ .,
                                       nboot = 2000)
    saveRDS(lm_xwy_std_ci, "eg2_lm_xwy_std_ci.rds", compress = "xz")
  }
set.seed(649017)
lm_xwy_std_ci <- std_selected_boot(lm_out,
                                   to_standardize = ~ .,
                                   nboot = 2000)
If the default options are acceptable, the only additional argument is nboot.
summary(lm_xwy_std_ci)
tmp <- summary(lm_xwy_std_ci)$coefficients
The standardized moderation effect is given by formatC(tmp["emot:cons", "Estimate"], 4, format = "f"), and the 95% nonparametric bootstrap confidence interval runs from formatC(tmp["emot:cons", "CI Lower"], 4, format = "f") to formatC(tmp["emot:cons", "CI Upper"], 4, format = "f").
Note: As a by-product, the nonparametric bootstrap percentile confidence intervals of the other coefficients are also reported. They can be used for other variables that are standardized in the same model, whether they are involved in the moderation or not.
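For readers who want to see the logic of a nonparametric percentile bootstrap for a standardized moderation effect, the following base R sketch resamples cases, standardizes within each bootstrap sample, refits the model, and takes percentiles of the resulting estimates. The data, the helper std_mod(), and the small nboot value are hypothetical, chosen for illustration; this is not the implementation of std_selected_boot() and may differ from it in details.

```r
# Hypothetical simulated data
set.seed(4)
n <- 200
dat <- data.frame(x = rnorm(n, mean = 5, sd = 2),
                  w = rnorm(n, mean = 10, sd = 3))
dat$y <- 1 + 0.4 * dat$x + 0.3 * dat$w + 0.1 * dat$x * dat$w +
  rnorm(n, sd = 3)

# Hypothetical helper: standardize all columns of d, then return the
# product-term coefficient of the moderated regression
std_mod <- function(d) {
  d_std <- as.data.frame(lapply(d, function(v) (v - mean(v)) / sd(v)))
  unname(coef(lm(y ~ x * w, data = d_std))["x:w"])
}

est <- std_mod(dat)

# Resample rows with replacement; standardization is redone in each sample
nboot <- 500
boot_est <- replicate(nboot, {
  i <- sample.int(n, replace = TRUE)
  std_mod(dat[i, ])
})

# 95% percentile confidence interval
ci <- quantile(boot_est, probs = c(.025, .975))
est
ci
```

Repeating the standardization inside each bootstrap sample is what lets the interval reflect the sampling variability of the means and standard deviations, which a standard error computed on already standardized data would ignore.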
vignette("plotmod", package = "stdmod") illustrates how to use plotmod() to plot a moderation effect. If variables are standardized by std_selected(), plotmod() can indicate this in the plot.
vignette("cond_effect", package = "stdmod") illustrates how to use cond_effect() to compute conditional effects, that is, the effects of a predictor (focal variable) at selected levels of the moderator. cond_effect() supports outputs from std_selected().
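As a rough illustration of what a conditional effect is, the sketch below computes the effect of a predictor at three levels of the moderator (one standard deviation below the mean, the mean, and one standard deviation above the mean) directly from the coefficients of a moderated regression. The variables are hypothetical and simulated, and the code uses base R only; see the vignette mentioned above for the actual interface of cond_effect().

```r
# Hypothetical simulated data
set.seed(5)
n <- 300
x <- rnorm(n)
w <- rnorm(n, mean = 10, sd = 3)
y <- 1 + 0.4 * x + 0.3 * w + 0.1 * x * w + rnorm(n)

fit <- lm(y ~ x * w)
b <- coef(fit)

# Selected levels of the moderator
w_levels <- c(low = mean(w) - sd(w),
              mean = mean(w),
              high = mean(w) + sd(w))

# Conditional effect of x at each level: b_x + b_xw * w
cond <- unname(b["x"]) + unname(b["x:w"]) * w_levels
cond

# Check: recentering w at the high level makes the coefficient of x
# equal to the conditional effect at that level
w_hi <- w - w_levels[["high"]]
fit_hi <- lm(y ~ x * w_hi)
b_x_at_high <- unname(coef(fit_hi)["x"])
```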
Cheung, S. F., Cheung, S.-H., Lau, E. Y. Y., Hui, C. H., & Vong, W. N. (2022). Improving an old way to measure moderation effect in standardized units. Health Psychology, 41(7), 502-505. https://doi.org/10.1037/hea0001188
Yuan, K.-H., & Chan, W. (2011). Biases and standard errors of standardized regression coefficients. Psychometrika, 76(4), 670-690. https://doi.org/10.1007/s11336-011-9224-6