knitr::opts_chunk$set(
    echo = TRUE,
    message = FALSE,
    warning = FALSE
)

#necessary to render tutorial correctly
library(learnr) 
library(htmltools)
#tidyverse
library(dplyr)
library(ggplot2)
#non tidyverse
library(afex)
library(BayesFactor)
library(broom)
library(car)
library(effectsize)
library(emmeans)
library(ggfortify)
library(Hmisc)
library(knitr)
library(modelbased)
library(parameters)
library(sandwich)


source("./www/discovr_helpers.R")


#Read data files needed for the tutorial

goggles_tib <- discovr::goggles
xbox_tib <- discovr::xbox |> 
  dplyr::mutate(
    game = forcats::as_factor(game) |> forcats::fct_relevel("Static"),
    console = forcats::as_factor(console) |> forcats::fct_relevel("Xbox One")
  )
# Create bib file for R packages
here::here("inst/tutorials/discovr_13/packages.bib") |>
  knitr::write_bib(c('here', 'tidyverse', 'dplyr', 'readr', 'forcats', 'tibble', 'knitr', 'afex', 'BayesFactor', 'broom', "car", 'effectsize', 'emmeans', "ggfortify", 'Hmisc', 'modelbased', 'parameters', 'sandwich'), file = _)

discovr: Factorial designs (GLM 3)

Overview

discovr package hex sticker, female space pirate with gun. Gunsmoke forms the letter R. **Usage:** This tutorial accompanies [Discovering Statistics Using R and RStudio](https://www.discovr.rocks/) [@field_discovering_2023] by [Andy Field](https://en.wikipedia.org/wiki/Andy_Field_(academic)). It contains material from the book so there are some copyright considerations but I offer them under a [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License](http://creativecommons.org/licenses/by-nc-nd/4.0/). Tl;dr: you can use this tutorial for teaching and non-profit activities but please don't meddle with it or claim it as your own work.

r cat_space(fill = blu) Welcome to the discovr space pirate academy

Hi, welcome to discovr space pirate academy. Well done on embarking on this brave mission to planet r rproj()s, which is a bit like Mars, but a less red and more hostile environment. That's right, more hostile than a planet without water. Fear not though, the fact you are here means that you can master r rproj(), and before you know it you'll be as brilliant as our pirate leader Mae Jemstone (she's the badass with the gun). I am the space cat-det, and I will pop up to offer you tips along your journey.

On your way you will face many challenges, but follow Mae's system to keep yourself on track:

It's not just me that's here to help though, you will meet other characters along the way:

Also, use hints and solutions to guide you through the exercises (Figure 1).

Each codebox has a hints or solution button that activates a popup window containing code and text to guide you through each exercise.
Figure 1: In a code exercise click the hints button to guide you through the exercise.

By for now and good luck - you'll be amazing!

Workflow

Packages

This tutorial uses the following packages:

It also uses these tidyverse packages [@R-tidyverse; @tidyverse2019]: dplyr [@R-dplyr], forcats [@R-forcats], ggplot2 [@wickhamGgplot2ElegantGraphics2016], and readr [@R-readr], `.

Coding style

There are (broadly) two styles of coding:

  1. Explicit: Using this style you declare the package when using a function: package::function(). For example, if I want to use the mutate() function from the package dplyr, I will type dplyr::mutate(). If you adopt an explicit style, you don't need to load packages at the start of your Quarto document (although see below for some exceptions).

  2. Concise: Using this style you load all of the packages at the start of your Quarto document using library(package_name), and then refer to functions without their package. For example, if I want to use the mutate() function from the package dplyr, I will use library(dplyr) in my first code chunk and type the function as mutate() when I use it subsequently.

Coding style is a personal choice. The Google r rproj() style guide and tidyverse style guide recommend an explicit style, and I use it in teaching materials for two reasons (1) it helps you to remember which functions come from which packages, and (2) it prevents clashes resulting from using functions from different packages that have the same name. However, even with this style it makes sense to load tidyverse because the dplyr and ggplot2 packages contain functions that are often used within other functions and in these cases explicit code is difficult to read. Also, no-one wants to write ggplot2:: before every function from ggplot2.

You can use either style in this tutorial because all packages are pre-loaded. If working outside of the tutorial, load the tidyverse package (and any others if you're using a concise style) at the beginning of your Quarto document:

library(tidyverse)

I try to follow the Google r rproj() style guide and tidyverse style guide in always declaring the package when using a function: package::function(). For example, if I want to use the mutate() function from the package dplyr, I will write dplyr::mutate().

It is good practice to be explicit about packages to avoid clashes where functions from different packages have the same name. It also means that you don't need to load packages at the start of your Quarto document.

There are two main exceptions to this rule.

  1. There are functions within some tidyverse packages that would be used within other functions. Including the package name makes the code difficult to read. Also, no-one wants to write ggplot2:: before every function from ggplot2.
  2. To use the pipe operator (|>) you need to have magrittr loaded.

We can load all of the packages that are exceptions in one step by loading tidyverse at the beginning of our Quarto document:

library(tidyverse)

Data

To work outside of this tutorial you need to download the following data files:

Set up an r rstudio() project in the way that I recommend in this tutorial, and save the data files to the folder within your project called [data]{.alt}. Place this code in the first code chunk in your Quarto document:

goggles_tib <- here::here("data/goggles.csv") |>
  readr::read_csv() |>
  dplyr::mutate(
    facetype = forcats::as_factor(facetype),
    alcohol = forcats::as_factor(alcohol)
  )

This code reads in the data and converts the variables alcohol and facetype to a factor (categorical variable). For the xbox data use the code (in which I have also been explicit in setting certain categories as the first level of the factor):

xbox_tib <- here::here("data/xbox.csv") |>
  readr::read_csv() |>
  dplyr::mutate(
    game = forcats::as_factor(game) |> forcats::fct_relevel("Static"),
    console = forcats::as_factor(console) |> forcats::fct_relevel("Xbox One")
  )

r bmu() Beer goggles [(2)]{.alt}

The main example in this tutorial is from [@field_discovering_2023], who uses an example of an experimental design with two independent variables (a two-way independent design). The study tested the prediction that subjective perceptions of physical attractiveness become inaccurate after drinking alcohol (the well-known beer-goggles effect). The example is based on research that looked at whether the beer-goggles effect was influenced by the attractiveness of the faces being rated [@chen_moderating_2014]. The logic is that alcohol consumption has been shown to reduce accuracy in symmetry judgements, and symmetric faces have been shown to be rated as more attractive. If the beer-goggles effect is driven by alcohol impairing symmetry judgements then you'd expect a stronger effect for unattractive (asymmetric) faces (because alcohol will affect the perception of asymmetry) than attractive (symmetric) ones. The data well analyse are fictional, but the results mimic the findings of this research paper.

An anthropologist was interested in the effects of facial attractiveness on the beer-goggles effect. She selected 48 participants who were randomly subdivided into three groups of 16: (1) a placebo group drank 500 ml of alcohol-free beer; (2) a low-dose group drank 500 ml of average strength beer (4% ABV); and (3) a high-dose group drank 500 ml of strong beer (7% ABV). Within each group, half (n = 8) rated the attractiveness of 50 photos of unattractive faces on a scale from 0 (pass me a paper bag) to 10 (pass me their phone number) and the remaining half rated 50 photos of attractive faces. The outcome for each participant was their median rating across the 50 photos (These photographs were from a larger pool of 500 that had been pre-rated by a different sample. The 50 photos with the highest and lowest ratings were used.). The data are in [goggles_tib]{.alt}, which contains the variables facetype (unattractive vs attractive), alcohol (placebo, low dose, high dose) and attractiveness (the median rating of each participant out of 10).

r alien() Alien coding challenge

View the data in [goggles_tib]{.alt}.


goggles_tib

Note that there are four variables: the participant's id, which is a character variable (note the []{.alt} under the name), the facetype in the photo (unattractive or attractive) and the alcohol consumption (placebo, low or high), both of which are factors (note the []{.alt} under the names). Finally, the attractiveness score is numeric and has the data type 'double' (note the []{.alt} under the name).

The variables facetype and alcohol are factors (categorical variable), so having read the data file and converted these variables to factors it's a good idea to check that the levels of these variables are in the order that we want: unattractive and attractive for facetype and placebo, low high for alcohol.

r alien() Alien coding challenge

Using what you've learnt in previous tutorials check the order of the levels of the variables facetype and alcohol.


# use this function:
levels()
# Remember that to access a variable you use:

name_of_tibble$name_of_variable
levels(goggles_tib$facetype)
levels(goggles_tib$alcohol)

Because I have set up the data within this tutorial you should see that the levels are listed in the order that we want them when you execute the code.

r bmu() Exploring the data [(1)]{.alt}

r alien() Alien coding challenge

Use what you already know to create an object called [goggles_sum]{.alt} that contains the mean and a 95% confidence interval of attractiveness scores split by the type of face in the stimulus and the alcohol consumption. Print this object rounding to 2 decimal places and with a caption.


# Start by piping the tibble into the group_by function to group output by facetype and alcohol:
goggles_sum <- goggles_tib |> 
  dplyr::group_by(facetype, alcohol)
# Now pipe the results into the summarize() function
# Pipe the results into the summarize() function
goggles_sum <- goggles_tib |> 
  dplyr::group_by(facetype, alcohol) |> 
  dplyr::summarize()
# Within summarize(), use the mean() function to create a variable that is the mean attractiveness score
# Use the mean() function to create a variable that is the mean attractiveness score:
goggles_sum <- goggles_tib |> 
  dplyr::group_by(facetype, alcohol) |> 
  dplyr::summarize(
    mean = mean(attractiveness, na.rm = TRUE)
  )
# Add two more rows that use mean_cl_normal to calculate the lower and upper boundary of the 95% confidence interval
goggles_sum <- goggles_tib |> 
  dplyr::group_by(facetype, alcohol) |> 
  dplyr::summarize(
    mean = mean(attractiveness, na.rm = TRUE),
    `95% CI lower` = mean_cl_normal(attractiveness)$ymin,
    `95% CI upper` = mean_cl_normal(attractiveness)$ymax
  )
# print to 2 decimal places
goggles_sum <- goggles_tib |> 
  dplyr::group_by(facetype, alcohol) |> 
  dplyr::summarize(
    mean = mean(attractiveness, na.rm = TRUE),
    `95% CI lower` = mean_cl_normal(attractiveness)$ymin,
    `95% CI upper` = mean_cl_normal(attractiveness)$ymax
  )

goggles_sum |> 
  knitr::kable(digits = 2, caption = "Summary statistics for the beer goggles data")

Note that the mean attractiveness is very similar across the doses of alcohol for the attractive faces, but varies in the unattractive faces.

r alien() Alien coding challenge

Use what you already know to plot the mean and a 95% confidence interval of attractiveness scores split by the type of face (x-axis) being rated and the alcohol consumption (colour).


# Start by setting up the plot (replace the xs):
ggplot2::ggplot(xxxxx, aes(x = xxxxx, y = xxxxx, colour = xxxxx)) 
# Now use stat_summary() to add the data (replace the xs)
ggplot2::ggplot(goggles_tib, aes(x = alcohol, y = attractiveness, colour = facetype)) +
  stat_summary(fun.data = "xxxxx", geom = "xxxxxx", position = position_xxxxx(width = 0.2))
# Use coord_cartesian() and scale_y_continuous to set the limits and breaks for the y-axis to be whole numbers between 0 and 10 (replace the xs):
ggplot2::ggplot(goggles_tib, aes(x = alcohol, y = attractiveness, colour = facetype)) +
  stat_summary(fun.data = "mean_cl_normal", geom = "pointrange", position = position_dodge(width = 0.2)) +
  coord_cartesian(xxxxxx) +
  scale_y_continuous(xxxxx)
# use labs() to add axis labels to the x, y and colour legend (replace xxxs):
ggplot2::ggplot(goggles_tib, aes(x = alcohol, y = attractiveness, colour = facetype)) +
  stat_summary(fun.data = "mean_cl_normal", geom = "pointrange", position = position_dodge(width = 0.2)) +
  coord_cartesian(ylim = c(0,10)) +
  scale_y_continuous(breaks = 0:10) +
  labs(x = "xxxxxxx", y = "xxxxxxx", xxxxx = "Type of face")
# add a theme (replace the xs):
ggplot2::ggplot(goggles_tib, aes(x = alcohol, y = attractiveness, colour = facetype)) +
  stat_summary(fun.data = "mean_cl_normal", geom = "pointrange", position = position_dodge(width = 0.2)) +
  coord_cartesian(ylim = c(0,10)) +
  scale_y_continuous(breaks = 0:10) +
  labs(x = "Alcohol consumption", y = "Attractiveness (0-10)", colour = "Type of face") +
  xxxxxx()
# Solution
ggplot2::ggplot(goggles_tib, aes(x = alcohol, y = attractiveness, colour = facetype)) +
  stat_summary(fun.data = "mean_cl_normal", geom = "pointrange", position = position_dodge(width = 0.2)) +
  coord_cartesian(ylim = c(0,10)) +
  scale_y_continuous(breaks = 0:10) +
  labs(x = "Alcohol consumption", y = "Attractiveness (0-10)", colour = "Type of face") +
  theme_minimal()

Note again that the mean attractiveness is very similar across the doses of alcohol for the attractive faces, but varies in the unattractive faces.

r user_visor() The model [(2)]{.alt}

The model we're fitting is described by the following equation:

$$ \begin{aligned} Y_i & = \hat{b}_0 + \hat{b}_1X_i+ e_i\ \text{attractiveness}_i & = \hat{b}_0 + \hat{b}_1\text{facetype}_i + \hat{b}_2\text{alcohol}_i + \hat{b}_3[\text{facetype} \times \text{alcohol}]_i + e_i \end{aligned} $$

We can include categorical predictors in the linear model using dummy variables and the function lm() as we have done before. However, when looking at F-statistics for the overall effect of predictors things get quite complicated as we saw in the previous tutorial. The lm() method is useful for when you want control over what the parameter estimates, but it's complicated. A simpler method is to use the afex package. We'll start with this method.

r bmu() Fitting the model using the afex package [(1)]{.alt}

When faced with a factorial design (i.e. only categorical predictors) a more user-friendly approach is offered by the afex package (stands for [a]{.alt}nalysis of [f]{.alt}actorial [ex]{.alt}periments). We're going to use the aov_4() function, which the the following format:

afex::aov_4(outcome ~ predictors + (1|id_variable), data = my_tib)

In short, we specify the model like we would with lm(), and replace [my_tib]{.alt} with the name of our tibble. There is an additional term in the model, [(1|id_variable)]{.alt}, and this tells the function something about the structure of the data. Specifically, it tells the function how scores are clustered. In the current design, scores are clustered within individuals so we would replace [id_variable]{.alt} with the variable that uniquely identifies the different participants (this variable is called [id]{.alt}). In subsequent tutorials we'll use this argument to specify different types of designs.

r robot() Code example

Putting all of this together, we could fit the model with this code:

goggles_afx <- afex::aov_4(attractiveness ~ facetype*alcohol + (1|id), data = goggles_tib)
goggles_afx

r alien() Alien coding challenge

Use the aov_4() function to fit the model.


# fit the model (replace the xs):
goggles_afx <- afex::aov_4(xxxxx ~ xxxxx*xxxxx + (1|xxxxx), data = xxxxx)
# fit the model:
goggles_afx <- afex::aov_4(attractiveness ~ facetype*alcohol + (1|id), data = goggles_tib)
goggles_afx #this shows us the model
quiz(caption = "Interpreting factorial designs",
  question("Using the output above, interpret the effect of **facetype**.",
    answer("This effect means that overall when we ignore how much alcohol had been drunk the type of face being rated significantly affected attractiveness ratings.", correct = T, message = "The main effect of type of face is significant because the *p* associated with the *F*-statistic is given as < .001, which is less than 0.05"),
    answer("This effect means that overall when we ignore how much alcohol had been drunk the type of face being rated did not significantly affect attractiveness ratings", message = "Sorry, but this answer is incorrect. The main effect of type of face would be considered *significant* because the *p* associated with the *F*-statistic is given as < .001, which is less than 0.05"),
    answer("Tell me a terrible cat joke", message = "What do you use to brush a cat? A catacomb."),
    correct = "Correct - well done!",
    incorrect = "",
    random_answer_order = TRUE,
    allow_retry = T
  ),
    question("Using the output above, interpret the effect of **alcohol**.",
    answer("This effect means that when we ignore whether the participant rated unattractive or attractive faces the amount of alcohol significantly influenced their attractiveness ratings.", correct = T, message = "The main effect of alcohol is significant because the *p* associated with the *F*-statistic is given as 0.005, which is less than 0.05"),
    answer("This effect means that overall when we ignore how much alcohol had been drunk the type of face being rated did not significantly affect attractiveness ratings", message = "The main effect of alcohol would be considered *significant* because the *p* associated with the *F*-statistic is given as 0.005, which is less than 0.05"),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Using the output above, interpret the significant effect of **facetype:alcohol** (select ALL that apply).",
    answer("The effect of alcohol on attractiveness ratings was different when rating unattractive faces compared to when rating attractive faces.", correct = T),
    answer("The difference between the mean ratings of attractive and unattractive faces varied as a function of how much alcohol was consumed.", correct = T),
    answer("The difference between the mean ratings of attractive and unattractive faces was similar in each alcohol group."),
    answer("The difference between the mean ratings across the three alcohol groups was similar for attractive and unattractive faces."),
    answer("Attractiveness ratings were similar regardless of how much alcohol was consumed and whether the face was attractive or not."),
    correct = "Correct - well done!",
    incorrect = "Incorrect. Hint: The fact that the interaction effect was significant suggests that the effect of the type of face depended on how much alcohol was consumed and vice versa.",
    random_answer_order = TRUE,
    allow_retry = T
  )
)

The results show significant main effects and interactions for all variables. The main effects are not interesting in the context of the significant interaction effect, so we'll focus on the significant interaction effect. We can interpret this interaction effect using a plot.

`r bug()` **De-bug: Strange messages** You might be distressed by a message telling you wzxhzdk:31 This message is nothing to worry about, but if you want to know what it's all about work through the section on fitting the model using `lm()` where all is revealed.

r bmu() Plots using the afex package [(1)]{.alt}

A neat feature of afex is that you can get plots of the interaction without needing to do it using ggplot2 (even though I forced you to do that for practice at the start of the tutorial). Having created an afex object we can feed it into the afex_plot(), which takes the general form:

afex::afex_plot(afx_object, x_variable, line_shape_variable, panel_variable)

In which you replace [afx_object]{.alt} with the name of the model you fitted with aov_4(), [x_variable]{.alt} with the predictor you want on the x-axis, and [line_shape_variable]{.alt} with a predictor that you want to be depicted using different lines/shapes. If you have a third categorical predictor, replace [panel_variable]{.alt} with the name of that predictor and its different categories will be displayed across different panels (e.g., facet_wrap() style). The result is a ggplot object so you can use ggplot2 code to edit the results, for example, you can apply a standard ggplot2 theme.

r robot() Code example

To plot the [facetype*alcohol]{.alt} interaction we could use this code:

afex::afex_plot(goggles_afx, "alcohol", "facetype")

r alien() Alien coding challenge

Plot the [facetype*alcohol]{.alt} interaction, add axis labels and apply theme_minimal().

goggles_afx <- afex::aov_4(attractiveness ~ facetype*alcohol + (1|id), data = goggles_tib)

# Plot the means (replace the xs):
afex::afex_plot(xxxxx, "xxxxx", "xxxxx")
afex::afex_plot(goggles_afx, "alcohol", "facetype")
# now add axis labels to override the defaults in the usual way
# plot the means and add axis labels:
afex::afex_plot(goggles_afx, "alcohol", "facetype") +
  labs(x = "Alcohol consumption", y = "Attractiveness rating (0-10)")
# now add theme_minima() in the usual way
# solution:
afex::afex_plot(goggles_afx, "alcohol", "facetype") +
  labs(x = "Alcohol consumption", y = "Attractiveness rating (0-10)") +
  theme_minimal()

The plot seems to suggest that attractive stimuli are rated as more attractive than unattractive stimuli in the placebo group, but are rated fairly similarly in the low dose group and very similarly in the high alcohol dose group. Put another way, the beer googles effect seems to being rearing it's head: as more alcohol is consumed the ratings of unattractive faces get more similar to those of attractive faces.

goggles_afx <- afex::aov_4(attractiveness ~ facetype*alcohol + (1|id), data = goggles_tib)
gog_afx_tbl <- goggles_afx$anova_table
`r pencil()` **Report**`r rproj()` There was a significant effects of the type of face used, `r report_afx(gog_afx_tbl, row = 1)`, and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 2)`. However, these effects were superseded by a significant interaction between the type of face being rated and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 3)`. This interaction suggests that the effect of alcohol is moderated by the type of face being rated (and vice versa). Based on the means (see plot) this interaction supports the 'beer-googles' hypothesis: when no alcohol is consumed symmetric faces were rated as more attractive than asymmetric faces but this difference diminishes as more alcohol is consumed.

r bmu() Estimated marginal means [(1)]{.alt}

It's also fairly straightforward to get the means in the plot we've just made using the emmeans() function from the emmeans package. Place the name of your afex model into the function and include a vector of the variable names of predictors.

r robot() Code example

To get the estimated marginal means for the [alcohol*facetype]{.alt} interaction we would execute:

emmeans::emmeans(goggles_afx, c("alcohol", "facetype"))

r alien() Alien coding challenge

Obtain the estimated marginal means for the [facetype*alcohol]{.alt} interaction.


emmeans::emmeans(goggles_afx, c("alcohol", "facetype"))

r user_visor() Fitting the model using lm() [(2)]{.alt}

You might have noticed that when you used aov_4() that a message told you that contrasts had been set using contr.sum(). Remember that typically we want Type III sums of squares and these require predictor variables to have independent (or to use the posh term [orthogonal]{.alt}) contrasts. Orthogonal means that the numbers used to code groups must sum to zero and cross multiple to sum to zero. By default, r_proj() will use dummy coding (0s and 1s) which results in non-orthogonal contrasts. The message is telling you that aov_4() has created orthogonal contrasts for you so you don't have to worry about it. In short, aov_4() makes it super easy to test the predictors and their interaction without needing to think too much. The price you pay is a lack of flexibility in the contrasts. Which leads us on to the gateway to hell that is using lm().

You can extend everything from [discovr_12]{.alt} to the situation with multiple categorical predictors. Here's how.

r user_visor() Using built in contrasts [(2)]{.alt}

Unlike when using aov_4(), when using lm() we do have to think about setting orthogonal contrasts. The simplest way to ensure orthogonal contrasts is to set them using contr.sum(n) replacing the n with the number of categories/groups (the sum means that contrasts sum to zero). This is what aov_4() did for us.

r robot() Code example

Using contr.sum(n) we could set the contrasts for the two predictors by executing (note that we replace n with the number of groups):

contrasts(goggles_tib$facetype) <- contr.sum(2)
contrasts(goggles_tib$alcohol) <- contr.sum(3)

Having done this we'd fit the model. We want to enter the following predictors: [facetype]{.alt}, [alcohol]{.alt} and the [facetype*alcohol]{.alt} interaction. We learnt how to do this in [discovr_10]{.alt}.

`r info()` **Specifying interactions** Remember that we can specify an interaction term within a model formula in `r rproj()` in two ways. Using the current variables of [facetype]{.alt} and [alcohol]{.alt}, the first is `facetype:alcohol`. Using this method we'd specify the model formula as: wzxhzdk:45 The second method uses a shorthand for adding all main effects and their interactions, which is `facetype*alcohol`. This code will introduce the main effect of [facetype]{.alt}, the main effect of [alcohol]{.alt} and their interaction. Using this method we'd specify the model formula as: wzxhzdk:46 The two methods for specifying the model formula are interchangeable.

r robot() Code example

Therefore, to fit the model we'd execute:

goggles_lm <- lm(attractiveness ~ facetype*alcohol, data = goggles_tib)

Then to get the Type III sums of squares like we did in [discovr_12]{.alt} we'd execute:

car::Anova(goggles_lm, type = 3)

If we wanted a robust model (using HC3 standard errors) we could specify this within Anova():

car::Anova(goggles_lm, type = 3, white.adjust = "hc3")

r alien() Alien coding challenge

Try fitting the model as described above.


contrasts(goggles_tib$facetype) <- contr.sum(2)
contrasts(goggles_tib$alcohol) <- contr.sum(3)
goggles_lm <- lm(attractiveness ~ facetype*alcohol, data = goggles_tib)
car::Anova(goggles_lm, type = 3) |>   # or car::Anova(goggle_lm, type = 3, white.adjust = "hc3")
  knitr::kable(digits = 3)

The results show significant main effects and interactions for all variables. The main effects are not interesting in the context of the significant interaction effect, so we'll focus on the significant interaction effect. We can use the means we computed earlier to start to unpick this interaction. We can also used the estimate_means() function from modelbased to obtain them, which we met in [discovr_12]{.alt}. We have no covariates, so we could execute:

modelbased::estimate_means(goggles_lm)

To get the means across all combinations of levels of the predictors in the model [goggles_lm]{.alt}.

r alien() Alien coding challenge

Use estimate_means() to obtain means across all combinations of levels of the predictors in the model [goggles_lm]{.alt}.

contrasts(goggles_tib$facetype) <- contr.sum(2)
contrasts(goggles_tib$alcohol) <- contr.sum(3)
goggles_lm <- lm(attractiveness ~ facetype*alcohol, data = goggles_tib)

modelbased::estimate_means(goggles_lm)
`r bug()` **De-bug** The output contains a message: wzxhzdk:56 This warning appears because we didn't specify the [at]{.alt} argument of `estimate_means()` function, which tells the function the variables for which we want means. In the absence of the [at]{.alt} argument, the function assumes we want the means across all combinations of predictors, which is in fact what we want so we can ignore this message. Nevertheless, the message is helpful in reminding us that our instruction has been interpreted as wzxhzdk:57

We can see that in the placebo group attractive stimuli are rated as more attractive than unattractive ones, this is also true (but to a lesser extent) in the low dose of alcohol group, but in the high dose of alcohol group the mean attractiveness ratings are similar for the different types of stimuli. In other words, at high doses of alcohol, attractive and unattractive faces are rated as similarly attractive but this isn't the case at lower doses of alcohol or for a placebo dose.

`r cat_space()` **Tip** An alternative to `contr.sum(n)` is `contr.helmert(n)`, which will set up a contrast that compares each group to the average of previous groups. For example, our [alcohol]{.alt|} variable has 3 levels: placebo, low dose and high dose, `contr.helmert(3)` for this variable results in: * Contrast 1: low vs. placebo * Contrast 2: high vs. average of low and placebo groups combined

r user_astronaut() Using manual contrasts [(3)]{.alt}

From the point of view of the F-statistics it doesn't matter whether you use contr.sum() or contr.helmert() but it will affect the parameter estimates (bs) and what they represent. You might therefore, want to manually set contrasts like we did in [discovr_12]{.alt}.

For the [alcohol]{.alt} variable, we might set contrasts similar to the ones in [discovr_12]{.alt}, by creating two dummy variables using the contrast coding in Table 1. Contrast 1 compares the placebo group to the two alcohol groups combined, and the second contrast compares the low and high alcohol groups. Check back to [discovr_12]{.alt} to understand these codes. (Incidentally, these codes produce a Helmert contrast so in this case you'd get the same results from using contr.helmert())

con_tbl <- tibble(
  `Group` = c("Placebo", "Low dose", "High dose"),
  `Dummy 1 (Placebo vs. alcohol)` = c("-2/3", "1/3", "1/3"),
  `Dummy 2 (Low vs. High)` = c("0", "-1/2", "1/2")
  )

knitr::kable(con_tbl, caption = "Table 1: Contrast coding for the alcohol variable")

For the [facetype]{.alt} variable we could simply use a contrast that compares the attractive to unattractive stimuli. If we follow the rules that we learnt about contrast coding we'd:

r robot() Code example

We can set all of these contrast using the following code:

alcohol_vs_none <- c(-2/3, 1/3, 1/3)
low_vs_high <- c(0, -1/2, 1/2)
contrasts(goggles_tib$alcohol) <- cbind(alcohol_vs_none, low_vs_high)
contrasts(goggles_tib$facetype) <- c(-1/2, 1/2)

The first three lines set the contrasts for the variable [alcohol]{.alt} and the last line sets the contrast for [facetype]{.alt}.

Having set the contrasts, we could fit the model using the same code as before.

r alien() Alien coding challenge

Try fitting the model as described above.


alcohol_vs_none <- c(-2/3, 1/3, 1/3)
low_vs_high <- c(0, -1/2, 1/2)
contrasts(goggles_tib$alcohol) <- cbind(alcohol_vs_none, low_vs_high)
contrasts(goggles_tib$facetype) <- c(-1/2, 1/2)
goggles_lm <- lm(attractiveness ~ facetype*alcohol, data = goggles_tib)
car::Anova(goggles_lm, type = 3) |> 
  knitr::kable(digits = 3)

The overall results based on the F-statistic will be identical to before, however, because we have set up meaningful contrasts we can use the parameter estimates to interpret the interaction.

r alien() Alien coding challenge

Use what you know already to inspect the parameter estimates of the model [goggles_lm]{.alt}, which you have just created.

alcohol_vs_none <- c(-2/3, 1/3, 1/3)
low_vs_high <- c(0, -1/2, 1/2)
contrasts(goggles_tib$alcohol) <- cbind(alcohol_vs_none, low_vs_high)
contrasts(goggles_tib$facetype) <- c(-1/2, 1/2)
goggles_lm <- lm(attractiveness ~ facetype*alcohol, data = goggles_tib)
goggles_afx <- afex::aov_4(attractiveness ~ facetype*alcohol + (1|id), data = goggles_tib)

# Use tidy() from the broom package
broom::tidy(goggles_lm) |> 
  knitr::kable(digits = 3)
alcohol_vs_none <- c(-2/3, 1/3, 1/3)
low_vs_high <- c(0, -1/2, 1/2)
contrasts(goggles_tib$alcohol) <- cbind(alcohol_vs_none, low_vs_high)
contrasts(goggles_tib$facetype) <- c(-1/2, 1/2)
goggles_par <- lm(attractiveness ~ facetype*alcohol, data = goggles_tib) |> broom::tidy(conf.int = T)

There are two key effects here:

To sum up, the significant interaction is being driven by alcohol consumption (any dose compared to placebo, and high dose compared to low) affecting ratings of unattractive face stimuli significantly more than it affects ratings of attractive face stimuli.

`r pencil()` **Report**`r rproj()` There was a significant effects of the type of face used, `r report_afx(gog_afx_tbl, row = 1)`, and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 2)`. However, these effects were superseded by a significant interaction between the type of face being rated and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 3)`. Contrasts suggested that the difference between ratings of symmetric and asymmetric faces was significantly smaller after any dose of alcohol compared to no alcohol, $\hat{b}$ = `r report_pars(goggles_par, row = 5)`, and became smaller still when comparing a high- to a low-dose of alcohol, $\hat{b}$ = `r report_pars(goggles_par, row = 6)`. These effects support the 'beer-googles' hypothesis: when no alcohol is consumed symmetric faces were rated as more attractive than asymmetric faces but this difference diminishes as more alcohol is consumed.

r user_visor() Simple effects analysis [(2)]{.alt}

Regardless of whether you fit the model with lm() or aov_4(), a particularly effective way to break down interactions is [simple effects analysis]{.alt}, which looks at the effect of one predictor at individual levels of another. For example, we could do a simple effects analysis looking at the effect of type of face at each level of alcohol. This would mean taking the average attractiveness rating of unattractive faces and comparing it to that for attractive faces after a placebo drink, then making the same comparison after a low dose of alcohol, and then finally for a high dose. By doing so we ask: what is the effect of [facetype]{.alt} within each alcohol group?

An alternative is to quantify the effect of [alcohol]{.alt} (the pattern of means across the placebo, low dose and high dose) separately for unattractive and attractive faces.

We can do this analysis using the joint_tests() function from emmeans. You place your model ([goggles_afx]{.alt} or [goggles_lm]{.alt}) into the function and specify the predictor for which you want an analysis at each level.

r robot() Code example

For example, if we want to look at the effect of [facetype]{.alt} in each level of alcohol, we'd execute:

emmeans::joint_tests(goggles_afx, "alcohol")

for the model created with aov_4() called [goggles_afx]{.alt}, and

emmeans::joint_tests(goggles_lm, "alcohol")

for the model created with lm() called [goggles_lm]{.alt}.

If we wanted to look at the effect of [alcohol]{.alt} separately for attractive and unattractive stimuli, we'd execute:

emmeans::joint_tests(goggles_afx, "facetype")

for the model created with aov_4() called [goggles_afx]{.alt}, and

emmeans::joint_tests(goggles_lm, "facetype")

for the model created with lm() called [goggles_lm]{.alt}.

r alien() Alien coding challenge

Do a simple effects analysis to look at the effect of [alcohol]{.alt} separately for attractive and unattractive stimuli, use kable() to round values.


# replace the xs
emmeans::joint_tests(xxxxxx, "xxxxxx")
# for the model created with afex
emmeans::joint_tests(goggles_afx, "facetype") |> 
  knitr::kable(digits = 3)
# for the model created with lm
emmeans::joint_tests(goggles_lm, "facetype") |> 
  knitr::kable(digits = 3)
quiz(caption = "Interpreting simple effects 1 (level 2)",
  question("Using the output above, which of the following best describes the results from the simple effects analysis?",
    answer("There was a significant effect of alcohol on attractiveness ratings for unattractive faces, but not attractive ones.", correct = T),
    answer("There was a non-significant effect of alcohol on attractiveness ratings for unattractive faces, but a significant effect for attractive ones.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    answer("There was a non-significant effect of alcohol on attractiveness ratings for both unattractive and attractive faces.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    answer("There was a significant effect of alcohol on attractiveness ratings for both unattractive and attractive faces.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  )
)

r alien() Alien coding challenge

Let's try the simple effects analysis the other way around: obtain the simple effect of [facetype]{.alt} separately for each dose of [alcohol]{.alt}. Although there is no 'correct' way round to conduct the simple effects, this way makes the most sense (to me) in this example because it allows us to see at each dose of alcohol whether there is a significant difference in ratings of the two types of faces.


# replace the xs
emmeans::joint_tests(xxxxxx, "xxxxxx")
# for the model created with afex
emmeans::joint_tests(goggles_afx, "alcohol") |> 
  knitr::kable(digits = 3)
# for the model created with lm
emmeans::joint_tests(goggles_lm, "alcohol") |> 
  knitr::kable(digits = 3)
quiz(caption = "Interpreting simple effects 2 (level 2)",
  question("Using the output above, which of the following best describes the results from the simple effects analysis?",
    answer("There was a significant difference in the ratings of attractive and unattractive faces in the placebo group and the low dose group, but a non-significance difference in the high dose group.", correct = T),
    answer("There was a significant difference in the ratings of attractive and unattractive faces in the placebo group, but a non-significance difference in the low and high dose groups.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    answer("There was a significant difference in the ratings of attractive and unattractive faces in the placebo group and the high dose group, but a non-significance difference in the low dose group.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    answer("There was a significant difference in the ratings of attractive and unattractive faces in all of the groups.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    answer("There was a non-significant difference in the ratings of attractive and unattractive faces in all of the groups.", message = "Sorry, but this answer is incorrect. Remember that using the conventional 0.05 level of significance, an effect is significant if its *p*-value is *less than* 0.05."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  )
)
gog_se <- emmeans::joint_tests(goggles_afx, "alcohol")
`r pencil()` **Report**`r rproj()` There was a significant effects of the type of face used, `r report_afx(gog_afx_tbl, row = 1)`, and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 2)`. However, these effects were superseded by a significant interaction between the type of face being rated and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 3)`. Simple effects analysis revealed that symmetric faces were rated as significant more attractive than asymmetric faces after no alcohol, `r report_se(gog_se, row = 1)`, and a low dose, `r report_se(gog_se, row = 2)`, but were rated comparably after a high dose of alcohol, `r report_se(gog_se, row = 3)`. These effects support the 'beer-googles' hypothesis: the standard tendency to rate symmetric faces as more attractive than asymmetric faces was present at low doses and no alcohol, but was eliminated by a high dose of alcohol.

r bmu() Diagnostic plots [(1)]{.alt}

As with any linear model created with lm(), we can use the plot() function to produce diagnostic plots from the model. We cannot use this with models created by afex.

`r info()` **Diagnostic plots** Remember that `plot()` takes this general form: wzxhzdk:80 You can also use `ggplot2::autoplot()` to make pretty versions of the plot. To use this function outside of the tutorial remember to execute `library(ggfortify)`

r alien() Alien coding challenge

Obtain plots 1, 3, 2 and 4 (in that order) for the model [goggles_lm]{.alt}.


plot(goggles_lm, which = c(1, 3, 2, 4))

# or to get  a nicely formatted plots
# library(ggfortify)  # outside of this tutorial you'll need this

ggplot2::autoplot(goggles_lm,
                  which = c(1, 3, 2, 4),
                  colour = "#5c97bf",
                  smooth.colour = "#ef4836",
                  alpha = 0.5,
                  size = 1) + 
  theme_minimal()
quiz(caption = "Diagnostic plot quiz (level 2)",
  question("How would you interpret the *Residual vs. fitted* and *Scale-location* plots?",
    answer("Were in trouble: I see heteroscedasticity.", message = "Unlucky. The red line on the scale-location plot is fairly flat and the verstical lines of dots seem similar in length as we move along the *x*-axis indicating homoscedasticity."),
    answer("I'm not sure, give me a hint.", message = "Heteroscedasticity is shown up by a red line that isn't flat and a vertical spread of dots that changes as you move along the *x*-axis."),
    answer("Everything is fine - residuals show homogeneity.", correct = TRUE, message = "Yes, the red line is fiarly flat and the vertical spread of dots is similar as you move along the *x*-axis."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Based on the Q-Q plot, can we assume normality of the residuals?",
    answer("Yes", correct = TRUE, message = "The dots on the Q-Q plot only deviate slightly from the line at the extremes, which probably indicates a roughly normal distribution."),
    answer("No", message = "The dots on the Q-Q plot only deviate slightly from the line at the extremes, which probably indicates a roughly normal distribution."),
    answer("Give me a clue", message = "If residuals are normally distributed then the dots on the Q-Q plot should cling lovingly to the diagonal line."),
    correct = "Correct - Well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Based on the plot of Cook's distance, are there any influential cases?",
    answer("Yes", message = "The largest Cook's distance needs to be above about 1 before we'd worry."),
    answer("No", correct = TRUE),
    answer("Maybe", message = "Sorry, you're not allowed to sit on the fence!"),
    correct = "Correct - The largest Cook's distance is about 0.2 which is well below the threshold of 1 at which we'd worry. well done!",
    random_answer_order = TRUE,
    allow_retry = T
  )
)

r user_visor() Robust models [(2)]{.alt}

As for previous linear models(e.g., in discovr_08, discovr_09 and discovr_11), we can get robust parameter estimates using robust::lmRob() and robust tests of these parameters using parameters::model_parameters(). These methods won't work for models created with afex.

r alien() Alien coding challenge

Remember that we use lmRob in exactly the same way as lm(). Use this function to fit a robust version of the [goggles_lm]{.alt} model. Call the model [goggles_rob]{.alt} and remember to get the summary statistics using summary().


goggles_rob <- robust::lmRob(attractiveness ~ facetype*alcohol, data = goggles_tib)
summary(goggles_rob)

The bottom of the output shows significance tests of bias. These tests suggest that bias in the original model is not problematic (because the p-value for these tests are not significant - in fact, they are 1, completely not significant). The robust parameter estimates for the interaction terms ([facetype1:alcoholalcohol_vs_none]{.alt} and [facetype1 : alcohollow_vs_high ]{.alt}) have got smaller but are still both significant, so the profile of results doesn't change when robust parameter estimates are used.

r alien() Alien coding challenge

Remember from previous tutorials that to get a summary of an existing model like [goggles_lm]{.alt} that uses heteroscedasticity-consistent standard errors (i.e. robust significance tests and confidence intervals), we put the model into model_parameters() and set [vcov = \"HC4\"]{.alt}. Try this in the code box:


parameters::model_parameters(goggles_lm, vcov = "HC4") |> 
  knitr::kable(digits = 3) 

When we fit the model with heteroskedasticity-consistent standard errors the parameter estimates will match the non-robust model but the standard errors, p-values and confidence intervals change because these are based on methods robust to heteroscedasticity (the HC4 estimates that we asked for). For the two terms that represent the interaction term ([facetype1:alcoholalcohol_vs_none]{.alt} and [facetype1 : alcohollow_vs_high ]{.alt}) the profile of results is unchanged by using robust standard errors, both terms are significant and have 95% confidence intervals that do not contain 0. The fact that the profile of results is unchanged is not surprising given that the model plots suggested that homoscedasticity could be assumed.

Given the small sample size, we might also consider a bootstrap model of the parameter estimates and their confidence intervals and significance tests. We can obtain these using the bootstrap_parameters() function from parameters, which takes the general form:

parameters::bootstrap_parameters(my_model)

In which we replace [my_model]{.alt} with the name of the object containing the nonrobust model (in this case [goggles_lm]{.alt})

r alien() Alien coding challenge

Bootstrap the model :


parameters::bootstrap_parameters(goggles_lm) |> 
  knitr::kable(digits = 3)

The estimates themselves are quite similar to those from the non-robust model and both terms for the interaction ([facetype1:alcoholalcohol_vs_none]{.alt} and [facetype1 : alcohollow_vs_high ]{.alt}) are again significant.

r user_astronaut() Effect sizes [(3)]{.alt}

We can interpret the bs from the model as raw effect sizes (and there's a lot to be said for doing that). However, in previous tutorials we have seen that we can obtain effect sizes for the overall effect of a predictor (i.e. effect sizes that relate to the F-statistics for each predictor).

r robot() Code example

Specifically, we can use the eta_squared() and omega_squared() functions from the effectsize package [@effectsize2020; @R-effectsize], which take the general form:

effectsize::eta_squared(anova_object, partial = TRUE, ci = 0.9)
effectsize::omega_squared(anova_object, partial = TRUE, ci = 0.9)

All we do is put the object into the function (or pipe it in). By default you'll get partial eta-squared ($\eta^2_p$) and partial omega-squared ($\omega^2_p$), but you can get the non-partial version by setting [partial = FALSE]{.alt}, and you'll get a 90% confidence interval, which you might want to change to some other value.

r alien() Alien coding challenge

The function uses the sums of squares from the object that is passed into it, so its safest to re-use our earlier code with car::Anova() where we set the sums of squares and pipe it into the function. Try this below.

car::Anova(goggles_lm, type = 3) |> 
  effectsize::eta_squared(ci = 0.95) |> 
  knitr::kable(digits = 3)

If you fitted the model with afex you can pipe the model [goggles_afx]{.adj} directly into the function:

goggles_afx |> 
  effectsize::eta_squared(ci = 0.95) |> 
  knitr::kable(digits = 3)

These values show that the $\text{facetype} \times \text{alcohol}$ interaction (the effect we care about) explains 29% of the variance in attractiveness not attributable to other predictors, which is sizeable. As we have seen before, it's typically better to use partial omega squared ($\omega^2_p$), which is an unbiased version of $\eta^2_p$.

r alien() Alien coding challenge

Use the code example to compute partial omega squared for the predictors in either [goggles_lm]{.alt} or [goggles_afx]{.alt} depending on which you fitted earlier.


# solution for goggles_lm:
car::Anova(goggles_lm, type = 3) |> 
  effectsize::omega_squared(ci = 0.95) |> 
  knitr::kable(digits = 3)
# solution for goggles_afx:
goggles_afx |> 
  effectsize::omega_squared(ci = 0.95) |> 
  knitr::kable(digits = 3)
gog_os <- goggles_afx |> 
  effectsize::omega_squared(ci = 0.95)

The effect sizes are slightly smaller than (as we'd expect) using omega-squared. The interaction effect now explains about 25% of variation in attractiveness ratings.

`r pencil()` **Report**`r rproj()` There was a significant effects of the type of face used, `r report_afx(gog_afx_tbl, row = 1)`, `r report_es(gog_os, col = "Omega2_partial", row = 1)`, and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 2)`, `r report_es(gog_os, col = "Omega2_partial", row = 2)`. However, these effects were superseded by a significant interaction between the type of face being rated and the dose of alcohol, `r report_afx(gog_afx_tbl, row = 3)`, `r report_es(gog_os, col = "Omega2_partial", row = 3)`. This interaction suggests that the effect of alcohol is moderated by the type of face being rated (and vice versa). Based on the means (see plot) this interaction supports the 'beer-googles' hypothesis: when no alcohol is consumed symmetric faces were rated as more attractive than asymmetric faces but this difference diminishes as more alcohol is consumed.

r user_astronaut() Bayes factors [(3)]{.alt}

Like in previous tutorials (discovr_08, discovr_09, discovr_11, discovr_12) we can use the BayesFactor package [@morey_bayesfactor_2018]. For factorial designs we use the lmBF() function.

r robot() Code example

We saw in discovr_12 that the lmBF() function has this format:

my_model <- BayesFactor::lmBF(formula = outcome ~ predictor,
                              data = my_tib,
                              rscaleFixed = "medium",
                              rscaleCont = "medium")

We also saw that lmBF() uses default priors for categorical variables ([rscaleFixed]{.alt}) that can be specified as a number or as "medium" (the default), "wide", and "ultrawide". These labels correspond to r scale values of 1/2, $^\sqrt{2}/_2$, and 1. We could, therefore, obtain a Bayes factor for the entire model with the following code:

goggles_bf <-  BayesFactor::lmBF(formula = attractiveness ~ alcohol*facetype,
                                 data = goggles_tib,
                                 rscaleFixed = "medium")

However, we are more interested in quantifying the individual effects than the model overall, and in particular the interaction effect. Therefore, we're going to build three models: (1) alcohol as the only predictor; (2) a model that adds facetype as a predictor; (3) a model that adds the interaction term (alcohol:facetype). Having created these models, we'll compare them. So, we'll start by looking at the Bayes factor for alcohol as the sole predictor, then get the Bayes factor for what facetype adds to the model, then finally get the Bayes factor for what the interaction term adds to the model.

r robot() Code example

To create these models we'd use

alcohol_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol, data = goggles_tib)

facetype_bf <-  BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype, data = goggles_tib)

int_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype + alcohol:facetype, data = goggles_tib)

The first line creates the model with only alcohol as a predictor and stores it in the object [alcohol_bf]{.alt}, the second line adds facetype as a predictor and stores it in the object [facetype_bf]{.alt} and the third line adds the interaction term to the model and stores it in the object [int_bf]{.alt}. Having created the models we can compare them using this code:

alcohol_bf
facetype_bf/alcohol_bf
int_bf/facetype_bf

The first line gives us the Bayes factor for the model with only alcohol as a predictor, the second shows us the Bayes factor for the model with facetype added as a predictor relative to the model that has only alcohol as a predictor. In other words, it tells us what facetype adds to the model or, put another way, it quantifies the effect of facetype adjusting for alcohol. The third line shows us the Bayes factor for the model including all main effects and the interaction term relative to the model with only the main effects. This tells us what the alcohol:facetype interaction adds to the model above and beyond the main effects. Put another way, it's the Bayes factor for the interaction term, which is what we're interested in.

r alien() Alien coding challenge

Use the sample code to obtain BayesFactors for the model containing alcohol only, the model containing alcohol and facetype, and the model containing alcohol, facetype and their interaction. Compare each model to the previous one. Use default medium priors.


## Create the object that contains the Bayes factor for the model containing **alcohol**:

alcohol_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol, data = goggles_tib)
## Create the object that contains the Bayes factor for the model containing **alcohol** and **facetype**

facetype_bf <-  BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype, data = goggles_tib)
## Create the object that contains the Bayes factor for the model containing **alcohol**, **facetype** and their interaction:

int_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype + alcohol:facetype, data = goggles_tib)
## View the Bayes factor for each model compared to the previous one:

alcohol_bf
facetype_bf/alcohol_bf
int_bf/facetype_bf
## Full solution:
alcohol_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol, data = goggles_tib)

facetype_bf <-  BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype, data = goggles_tib)

int_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype + alcohol:facetype, data = goggles_tib)

alcohol_bf
facetype_bf/alcohol_bf
int_bf/facetype_bf
m1 <- BayesFactor::lmBF(formula = attractiveness ~ alcohol, data = data.frame(goggles_tib))
facetype_bf <-  BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype, data = data.frame(goggles_tib))
int_bf <- BayesFactor::lmBF(formula = attractiveness ~ alcohol + facetype + alcohol:facetype, data = data.frame(goggles_tib))

m2 <- facetype_bf/m1
m3 <- int_bf/facetype_bf
`r cat_space()` **Tip** The Bayes factors in [facetype_bf]{.alt} and [int_bf]{.alt} are computed using a sampling process and so will change each time you run the code. It also means the values I report will differ from those you obtain, which is confusing but nothing to worry about.

Looking at the first Bayes factor, the data are r get_bf(m1) times more likely under the alternative hypothesis (attractiveness is predicted from the dose of alcohol) than under the null (the dose of alcohol does not predict attractiveness ratings). Our beliefs that the dose of alcohol affects attractiveness ratings should increase by a factor of about r get_bf(m1) – in other words it should move away from the null. This value is fairly weak evidence, but then again we're not interested in this effect because it collapses across the type of face being rated.

Looking at the second Bayes factor, the data are r get_bf(m2) times more likely under the model that predicts attractiveness ratings from the type of face and dose of alcohol than under the model that predicts attractiveness from the dose of alcohol alone. In other words, our beliefs that the type of face being rated affects ratings of attractiveness should shift away from the null by a factor of r get_bf(m2) (a substantial change and strong evidence).

Looking at the final Bayes factor, the data are r get_bf(m3) times more likely under the model that predicts attractiveness ratings from the combined effect of the type of face and the dose of alcohol than under the model that predicts attractiveness from the main effects of dose of alcohol and type of face. In other words, our beliefs that the type of face moderates the effect of alcohol on the ratings of attractiveness should shift away from the null by a factor of r get_bf(m3) (a substantial change and strong evidence).

r user_visor() Transfer task [(2)]{.alt}

Blow up your video

Let's 's look at a second example from [@field_discovering_2023], which is about injuries from video games:

A researcher was interested in what factors contributed to injuries resulting from game console use. She tested 40 participants who were randomly assigned to either an active or static game played on either a Nintendo Switch or Xbox One Kinect. At the end of the session their physical condition was evaluated on an injury severity scale ranging from 0 (no injury) to 20 (severe injury). The data are in [xbox_tib]{.alt}, which contains the variables [id]{.alt}, [game]{.alt} (static, active), [console]{.alt} (Switch, Xbox One), and [injury]{.alt} (a score from 0 to 20). Fit a model to see whether injury severity is significantly predicted from the type of game, the type of console and their interaction. You can use either lm() or aov_4().

`r info()` **Information** There is a detailed solution to this task at [https://www.discovr.rocks/solutions/alex/alex_13/#task-1310](https://www.discovr.rocks/solutions/alex/alex_12/#task-1310). [Materials not yet live]
quiz(caption= "Setting contrasts (level 3)",
     question("What contrast codes for the variables **game** and **console** are orthogonal and valid [Select ALL valid answers]",
    answer("**game**: static (-1/2), active (1/2); **console**: Switch (-1/2), Xbox One (1/2)", correct = T, message = "It would also be valid to reverse the signs and use **game**: static (1/2), active (-1/2); **console**: Switch (1/2), Xbox One (-1/2)."),
    answer("**game**: static (0), active (1); **console**: Switch (0), Xbox One (1)", message = "You have used dummy coding, these contrasts are not orthogonal."),
    answer("**game**: static (1), active (-1); **console**: Switch (1), Xbox One (-1)", correct = T, message = "These contrasts are orthogonal and valid, but I wouldn't use them because the bs won't represent the exact differences between means (but will be proportional to them)."),
    answer("**game**: static (1), active (2); **console**: Switch (-2), Xbox One (1)", message = "These contrasts are not orthogonal."),
    type = "learnr_checkbox",
    correct = "Correct - well done!",
    incorrect = "Good try.",
    random_answer_order = TRUE,
    allow_retry = T
)
)

r alien() Alien coding challenge

If you're going to answer the question using lm(), use the code box to set the contrasts for [game]{.alt} and [console]{.alt}.


# Start by defining the contrasts:

static_vs_active <- c(-1/2, 1/2)
switch_vs_xbox <- c(-1/2, 1/2)

# Now assign these variables to the appropriate predictors (replace the xxxs)
# Assign these variables to the appropriate predictors (replace the xxxs)

contrasts(xxxxx) <- static_vs_active
contrasts(xxxxx) <- switch_vs_xbox
active_vs_static <- c(-1/2, 1/2)
switch_vs_xbox <- c(-1/2, 1/2)
contrasts(xbox_tib$game) <- active_vs_static
contrasts(xbox_tib$console) <- switch_vs_xbox

# To view the contrasts:
contrasts(xbox_tib$game)
contrasts(xbox_tib$console)

r alien() Alien coding challenge

Use the code box to fit and look at the main model. Call the model [xbox_lm]{.alt} if you use lm() or [xbox_afx]{.alt} if you use aov_4().

active_vs_static <- c(-1/2, 1/2)
switch_vs_xbox <- c(-1/2, 1/2)
contrasts(xbox_tib$game) <- active_vs_static
contrasts(xbox_tib$console) <- switch_vs_xbox

# Fill in the xs.
# using lm():
xbox_lm <- lm(xxx ~ xxx*xxx, data = xxxx)
car::Anova(xxxx, type = 3)
# using aov_4():
xbox_afx <- afex::aov_4(xxx ~ xxx*xxx + + (1|xxxx), data = xxxx)
xbox_afx
# The main model will be:
xbox_lm <- lm(injury ~ game*console, data = xbox_tib)
car::Anova(xbox_lm, type = 3)
# using aov_4():
xbox_afx <- afex::aov_4(injury ~ game*console + (1|id), data = xbox_tib)
xbox_afx
# Use tidy() to look at the parameter estimates
# Use tidy() to look at the parameter estimates (lm() only):
broom::tidy(xbox_lm, conf.int = TRUE)
# Full solution:
xbox_lm <- lm(injury ~ game*console, data = xbox_tib)
car::Anova(xbox_lm, type = 3)
broom::tidy(xbox_lm, conf.int = TRUE)
# using aov_4():
xbox_afx <- afex::aov_4(injury ~ game*console + (1|id), data = xbox_tib)
xbox_afx

r alien() Alien coding challenge

Use the code box to get the mean injury score in all combinations of the type of game and the type of console.

active_vs_static <- c(-1/2, 1/2)
switch_vs_xbox <- c(-1/2, 1/2)
contrasts(xbox_tib$game) <- active_vs_static
contrasts(xbox_tib$console) <- switch_vs_xbox
xbox_lm <- lm(injury ~ game*console, data = xbox_tib)
xbox_afx <- afex::aov_4(injury ~ game*console + (1|id), data = xbox_tib)

# Fill in the xs.
# using lm():
modelbased::xxxxxxx(xxxxxx)
# using aov_4():
emmeans::emmeans(xbox_afx, c("game", "console"))
# The main model will be:
modelbased::estimate_means(xbox_lm)
# using aov_4():
emmeans::emmeans(xbox_afx, c("game", "console"))
# Use tidy() to look at the parameter estimates

r alien() Alien coding challenge

Use the code box to get the simple effect of the type of game for each type of console.


# Fill in the xs.
emmeans::joint_tests(xxxx, "xxxx")
# using lm():
emmeans::joint_tests(xbox_lm, "console") |> 
  knitr::kable(digits = 3)
# using aov_4():
emmeans::joint_tests(xbox_afx, "console") |> 
  knitr::kable(digits = 3)
quiz(caption = "Interpreting interactions (level 2)",
     question("Generally speaking, how would you interpret the interaction between the type of game and the type of console?",
    answer("There was a significant interaction between the type of game played and the type of console on injury severity. This means that the effect of the type of game on injury was moderated by the type of console used.", correct = T, message = "Looking at the main table, we can see that the interaction effect was significant, *F*(1, 36) = 5.05, *p* = .031."),
    answer("There was a non-significant interaction between the type of game played and the type of console on injury severity. This means that the effect of the type of game on injury was not moderated by the type of console used.", message = "Sorry, that's not correct. The *p*-value for the effect is 0.031, which would typically be interpreted as *significant* because it is less than 0.05."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Having decomposed the interaction term with emmeans and simple effects, how would you interpret the interaction between the type of game and the type of console?",
    answer("The effect of the type of game was significantly different between the Switch and Xbox: injury severity was not significantly different for active and static games when using the Xbox, whereas for the Switch injuries were more severe for active games compared to passive ones.", correct = T),
    answer("The effect of the type of game was significantly different between the Switch and Xbox: injury severity was significantly different for active and static games in both the Switch and Xbox.", message = "Sorry, that's not correct. Although it sort of nearly is. In fact the simple effect for the Xbox had p = 0.0523, which is hovering on the threshold and sort of illustrates how p can force you into quite arbitrary conclusions. One thing we can say is that the effect of active games compared to static ones on injuries is significantly stronger for Switch games compared to xbox games."),
    answer("The effect of the type of game was not significantly different between the Switch and Xbox: injury severity was not significantly different for active and static games in both the Switch and Xbox.", message = "The interaction effect was significant, not non-significant."),
    answer("The effect of the type of game was significantly different between the Switch and Xbox: injury severity was significantly higher for Xbox games compared to Switch games.", message = "Sorry, that's not correct. First, the explanation describes the main effect of console (not the interaction) and injury severity was actually worse for the Switch."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  )
)

r alien() Alien coding challenge

If you used lm() to fit the model, use the code box to get diagnostic plots for [xbox_lm]{.alt}.


plot(xbox_lm, which = c(1, 3, 2, 4))

# Or for pretty plots

ggplot2::autoplot(xbox_lm,
                  which = c(1, 3, 2, 4),
                  colour = "#5c97bf",
                  smooth.colour = "#ef4836",
                  alpha = 0.5,
                  size = 1) + 
  theme_minimal()
quiz(caption = "Diagnostic plot quiz (level 2)",
  question("How would you interpret the *Residual vs. fitted* and *Scale-location* plots?",
    answer("Were in trouble: I see heteroscedasticity.", message = "Unlucky. The red line on the scale-location plot is fairly flat and the verstical lines of dots seem similar in length as we move along the *x*-axis indicating homoscedasticity."),
    answer("I'm not sure, give me a hint.", message = "Heteroscedasticity is shown up by a red line that isn't flat and a vertical spread of dots that changes as you move along the *x*-axis."),
    answer("Everything is fine - residuals show homogeneity.", correct = TRUE, message = "Yes, the red line is fiarly flat and the vertical spread of dots is similar as you move along the *x*-axis."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Based on the Q-Q plot, can we assume normality of the residuals?",
    answer("Yes", correct = TRUE, message = "The dots on the Q-Q plot don't seem to deviate from the line, which indicates a normal distribution."),
    answer("No", message = "If residuals are normally distributed then the dots on the Q-Q plot should cling lovingly to the diagonal line."),
    answer("Give me a clue", message = "If residuals are normally distributed then the dots on the Q-Q plot should cling lovingly to the diagonal line."),
    correct = "Correct - Well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Based on the plot of Cook's distance, are there any influential cases?",
    answer("Yes", message = "The largest Cook's distance needs to be above about 1 before we'd worry."),
    answer("No", correct = TRUE),
    answer("Maybe", message = "Sorry, you're not allowed to sit on the fence!"),
    correct = "Correct - The largest Cook's distance is about 0.4 which is well below the threshold of 1 at which we'd worry. well done!",
    random_answer_order = TRUE,
    allow_retry = T
  )
)

r alien() Alien coding challenge

If you used lm() to fit the model, use the code box to get tests for the parameter estimates of [xbox_lm]{.alt} that use heteroscedasticity-consistent standard errors (HC4).


parameters::model_parameters(xbox_lm, vcov = "HC4") |> 
  knitr::kable(digits = 3)
quiz(caption = "Robust model quiz (level 3)",
  question("How would you interpret the robust contrast (labelled **game1 : console1**)?",
    answer("Using robust standard errors, it is still the case that the effect of active compared to static games on injury severity is still significant at 0.05 level of significance.", correct = TRUE),
    answer("Using robust standard errors, the effect of active compared to static games on injury severity is not significant at 0.05 level of significance.", message = "Remember that if the significance value is less than 0.05 that will typically be seen as *significant*."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  ),
  question("Which of the following statements about the value of 3.8 for the parameter estimate of the contrast labelled **game1 : console1** is true?",
    answer("It is the difference between the mean injury severity for static games subtracted from the mean for active games for the Xbox, subtracted from the same difference for the Switch", correct = TRUE, message = "Looking at the means, we can see that this value comes from (mean active Switch - mean static Switch) - (mean active Xbox - mean static Xbox) = (12.9 - 6.7) - (9.4 - 7) = 6.2 - 2.4 = 3.8. Phew!"),
    answer("It is the difference between the mean injury severity for active games subtracted from the mean for static games for the Xbox, subtracted from the same difference for the Switch.", message = "Nearly! Remember that at the start of the tutorial we coded static games as the reference category."),
    answer("It is the difference between the mean injury severity for static games subtracted from the mean for active games for the Switch, subtracted from the same difference for the Xbox", message = "Nearly! Remember that at the start of the tutorial we coded the Xbox as the reference category, not the Switch."),
    answer("It is the difference between the mean injury severity for active games subtracted from the mean for static games for the Switch, subtracted from the same difference for the Xbox", message = "Nearly! Remember that at the start of the tutorial we coded the Xbox as the reference category, not the Switch, and the static confdition as the reference not the active."),
    correct = "Correct - well done!",
    random_answer_order = TRUE,
    allow_retry = T
  )
)
discovr package hex sticker, female space pirate with gun. Gunsmoke forms the letter R. **A message from Mae Jemstone:** Many species, human, non-human or extra-terrestrial, thrive on social interactions. They thrive less on statistical interactions, and yet these make it possible to look at whether the effects of one variable on another are moderated by a third variable. I remember when I was gathering crew for my mission to the *Trench of Weltschmerz*, whose inhabitants, the Schmertleflergs, live in a perpetual state of universe weariness. We planned to test a theory that their weariness was created by particles of xanath in the atmosphere, but I needed my crew to remain unaffected by it because otherwise they'd become too disillusioned to finish the mission. So, we exposed some of them to xanath and some were not, and some wore special breathing filters and others wore regular air filters, and we measured their negativity. We established that in those exposed to xanath, only those with the regular air filters become more negative. There was an interaction effect! So, we stocked up on the new breathing filters, and set off to spread some love across the trench! Now you too can investigate these kinds of relationships between predictors. Great work cadets!

Resources {data-progressive=FALSE}

Statistics

r rproj()

Acknowledgement

I'm extremely grateful to Allison Horst for her very informative blog post on styling learnr tutorials with CSS and also for sending me a CSS template file and allowing me to adapt it. Without Allison, these tutorials would look a lot worse (but she can't be blamed for my colour scheme).

References



profandyfield/discovr documentation built on May 4, 2024, 4:32 p.m.