By default, LexOPS will split by a single variable for each use of split_by(), and will create items for each factorial cell. For instance, splitting by arousal into 2 levels, and emotional valence into 3 levels, would result in 6 factorial cells. But what if we want to generate items for just 2 of these 6 factorial cells? We can do this by creating a factor/character vector column in our data which will represent suitability for each factorial cell. This vignette provides an example, where we want to compare high arousal, negative emotional words to low arousal neutral words.

Packages

library(dplyr)
library(tidyr)
library(ggplot2)
library(forcats)
library(LexOPS)
set.seed(1)
theme_set(theme_minimal())

Coding the New Column

We've decided we want our stimuli to have two conditions: high arousal, negative, and low arousal, neutral, according to Warriner et al. (2013).

Both arousal and valence are on 9-point Likert scales, so let's imagine we decide on the following cut-offs:

Firstly we create the column that will contain the information about our conditions. An easy way to do this might be with dplyr's case_when() function. We will call the new column, emo_cond, because I'm unimaginative.

dat <- lexops |>
  mutate(emo_cond = case_when(
    AROU.Warriner >= 6 & VAL.Warriner <= 3 ~ "arou_neg",
    AROU.Warriner <= 3 & between(VAL.Warriner, 4, 6) ~ "neutral",
    TRUE ~ "none"
  ))

Now let's check our conditions' locations on the distributions of arousal and valence ratings.

dat |>
  select(string, AROU.Warriner, VAL.Warriner, emo_cond) |>
  pivot_longer(cols = c(AROU.Warriner, VAL.Warriner), names_to = "Variable", values_to = "Value") |>
  mutate(emo_cond = fct_infreq(as.factor(emo_cond))) |>
  ggplot(aes(Value, fill = emo_cond)) +
  geom_histogram(binwidth = 0.5) +
  facet_wrap(vars(Variable)) +
  scale_fill_manual(values = c("#999999", "#E69F00", "#56B4E9"))

We can also visualise the locations of our conditions in this 2D space.

dat |>
  mutate(emo_cond = fct_infreq(as.factor(emo_cond))) |>
  ggplot(aes(AROU.Warriner, VAL.Warriner, colour = emo_cond)) +
  geom_point() +
  scale_colour_manual(values = c("#999999", "#E69F00", "#56B4E9"))

Generate Stimuli

Let's imagine we decide those cut-offs are sensible. We can now generate matched stimuli, for only these two factorial cells.

stim <- dat |>
  split_by(emo_cond, "arou_neg" ~ "neutral") |>
  control_for(Length) |>
  control_for(Zipf.SUBTLEX_UK, -0.1:0.1) |>
  control_for(AoA.Kuperman, -1.5:1.5) |>
  generate(20)

Here are the 20 words per factorial cell we generated.

print(stim)
knitr::kable(stim)

Check Stimuli

We can use the plot_design() function to check the distributions of the variables we used to create the emo_cond column. This shows the expected differences between conditions A1 and A2, based on the method we used to create the new column.

plot_design(stim, c("AROU.Warriner", "VAL.Warriner"))


JackEdTaylor/LexOPS documentation built on Sept. 10, 2023, 3:09 a.m.