knitr::opts_chunk$set( collapse = FALSE, comment = "", fig.align = "center", fig.height = 4, fig.width = 6 ) options(digits=3)
library(GAMLj3)
In this example we run a moderated regression analysis with simple slopes analysis and simple slopes graphs. Data are from Cohen et al 2003 and can be downloaded here.
The research is about physical endurance associated with age and physical exercise. 245 participants were measured while jogging on a treadmill. Endurance was measured in minutes ('yendu' in the file). Participants' age (xage
in years) and number of years of physical exercise (zexer
in years) were recorded as well
data<-read.csv("https://raw.githubusercontent.com/mcfanda/gamlj_docs/master/data/exercise.csv") summary(data[,c("xage","zexer","yendu")])
The researcher is interested in studying the relationships between endurance, age, and exercising, with the hypothesis that the effect of age (expected to be negative) is moderated by exercise, such that the more participants work out (higher levels of exer
) the less age negatively affects endurance.
We can think about this analytic problem as a multiple regression, where the effect of age and exercise can be estimated while keeping constant the other variable. However, the researcher puts forward a moderation hypothesis, because s/he expects the effect of age to change for different levels of exercising. We than need an interaction between age and exercise.
We first run a multiple regression (to warm up), then we estimate a multiple regression with an interaction (moderated regression) and we probe the interaction with a simple slope analysis and simple slope graphs. Technical details can be found in Cohen et al 2003, or in Preacher website.
GAMLj gamlj_lm()
function minimum setup only requires to specify the formula of the model and the dataframe. The formula interface is the same one used in lm()
function.
mod1<-gamlj_lm(formula = yendu~xage+zexer, data=data)
mod1
The object `mod' is an R6 object which contains all the results tables and some additional function that can be used in subsequent analyses. Tables are pretty printed, but they can be modified or manipulated by transforming them in dataframes as follows:
anovaDF<-mod1$main$anova$asDF anovaDF
Results show three tables. The Model Info table contains information about the overall model. The ANOVA omnibus tests table contains the results of the car:Anova()
function, with the addition of the partial $\eta^2$ index for each effect and the inferential test for the whole model. The Fixed Effecta Parameter Estimates table contains the summary()
results. For each coefficient the confidence interval is also reported.
A special note should be made for the intercept. The intercept is the expected value (the mean) of the dependent variable, estimated for all independent variables equal to their means. This is because in gamlj_lm()
, continuous variables are centered to their mean by default. In case one wants the independent variables not to be centered, one can select a different scaling with the option scaling
.
Additional effect size indexes can be asked with the option effectSize
.
mod2<-gamlj_lm(formula = yendu~xage+zexer, data=data, es = c("beta", "eta","etap"))
mod2$main$anova
The same analysis can be done by updating the model with the update()
function. Almost all the options available in gamlj_lm()
can be added to a model by running update(mod,...)
where ...
is any option or options accepted by gamlj_lm()
.
mod2_2<-update(mod1,es = c("beta", "eta","etap","omega","epsilon"))
mod2_2$main$anova
To include the interaction we simply add the interaction effect in the formula.
mod3<-gamlj_lm(formula = yendu~xage*zexer, data=data, es = c("beta", "eta","etap"))
mod3$main$anova
Because variables are centered to their means, the first-order coefficients can be interpreted as "average" effects. One can also report the betas ($\beta$). The estimates of the betas are correct also in the presence of the interaction, because the variables are standardized before the interaction term is computed.
We can now probe the interaction. One can re-run the model adding the appropriate options to ask for simple effects or one can use the gamlj_simpleffects()
function, which is a convenience function to add simple effects to a pre-existing model. The function gamlj_simpleffects()
, however, only returns the simple effects tables, not the full model.
mod3b<-gamlj_lm(formula = yendu~xage*zexer, data=data, simple_x = "xage", simple_mods = "zexer")
mod3b$simpleEffects
Equivalently, we can do use the command simple_effects()
:
se<-simple_effects(mod3,simple_x = "xage",simple_mods = "zexer")
se
In this way we obtain the effect of age computed for high exercise (zexer
centered to 1 SD above average), the main effect of age (zexer
centered to its mean) and the effect of age computed for low exercise (zexer
centered to -1 SD above average). gamlGLM()
produces both the F-tests and the parameter estimates for the simple slopes. We focus on the latter table now.
One can change the conditioning levels of the moderators with the covs_conditioning
option (default is mean_sd
for mean plus/minus one SD), either added to the gamlj_lm()
function or to the simple_effects()
function. If one wants to use the percentiles (25%,50%,75%), for instance, one can run the following.
se<-simple_effects(mod3,simple_x = "xage",simple_mods = "zexer",covs_conditioning="percent")
se
The simple effects are now changed, because they are estimated for a different set of values of the moderator.
One can further tweak the appearance of the tables by selecting a different value/labels in covs_scale_labels
option. Options are "labels", "values" and "values_labels". The latter outputs the values and the labels of the conditioning values.
se<-simple_effects(mod3,simple_x = "xage",simple_mods = "zexer",covs_conditioning="percent",covs_scale_labels="values_labels")
se
We can get a clear picture of the interaction by asking for a plot. Also the plot module takes care of centering the variables in a way that makes the plot clearly understandable.
The options needed in gamlj_lm()
are plot_x
for the x-axis variable and plot_z
for the moderator. At which three levels of the moderator the separate lines are computed is decided by the option simpleScale
as for the simple effects.
mod4<-gamlj_lm(formula = yendu~xage*zexer, data=data, plot_x = "xage",plot_z= "zexer") mod4 plot(mod4)
We use theplot()
function. The function, applied to a gamlj
results object, returns one plot if it is present in the model, returned as a ggplot2 object. If more than one plot is present, a list of plots is returned. FALSE is returned if no plot is present or defined. The function plot()
can also be use to add new plots or to add options to the plots.
For instance, if we want to give a more honest account of the model fit, we can visualize the simple slopes over the the actual data. The function plot()
produces a new plot after adding any options accepted by gamlj_lm()
plot(mod4,plot_raw=T)
Any plot produced by gamlj_lm
or plot()
can be obtained as a ggplot2 object for further manipulations or usage. For instance, one can change the theme of the plot:
myplot<-plot(mod4) myplot+ggplot2::theme_grey()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.