```r
library(learnr)
library(tidyverse)
library(BayesFactor)
library(effsize)
knitr::opts_chunk$set(echo = FALSE, warning = FALSE)
tutorial_options(exercise.cap = "Exercise")

# Read data files needed for the tutorial
teddy_tib <- adventr::teddy_dat
zhang_female_tib <- adventr::zhang_female_dat

# Set up objects for code blocks
teddy_sum <- teddy_tib %>%
  dplyr::group_by(study_n, group) %>%
  dplyr::summarize(
    mean = mean(self_esteem),
    sd = sd(self_esteem),
    ci_low = ggplot2::mean_cl_normal(self_esteem)$ymin,
    ci_upp = ggplot2::mean_cl_normal(self_esteem)$ymax
  )
```

This tutorial is one of a series that accompanies *An Adventure in Statistics* [@RN10163] by me, Andy Field. These tutorials contain abridged sections from the book, so there are some copyright considerations, but I offer them under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.^[Basically you can use this tutorial for teaching and non-profit activities but do not meddle with it or claim it as your own work.]

- Who is the tutorial aimed at?
    - Anyone teaching from or reading *An Adventure in Statistics* may find them useful.
- What is covered?
    - This tutorial looks at how to compute effect sizes and Bayes factors for a basic research design that compares two groups. It would be a useful tutorial to run alongside teaching based on Chapter 11 of *An Adventure in Statistics*.
    - This tutorial *does not* teach the background theory: it is assumed you have either attended my lecture or read the relevant chapter in the aforementioned book (or someone else's). The aim of this tutorial is to augment the theory that you already know by guiding you through fitting linear models using **R** and **RStudio** and asking you questions to test your knowledge along the way.

Because these tutorials accompany my book *An adventure in statistics*, which uses a fictional narrative to teach the statistics, some of the examples might not make sense unless you know something about the story. For those of you who don't have the book I begin each tutorial with a précis of the story. If you're not interested then fair enough - click past this section.

It is the future. Zach, a rock musician, and Alice, a geneticist, have been together since high school and live together in Elpis, the ‘City of Hope’.

Zach and Alice were born in the wake of the Reality Revolution which occurred after a Professor Milton Gray invented the Reality Prism – a transparent pyramid worn on the head – that brought honesty to the world. Propaganda and media spin became unsustainable, religions collapsed, advertising failed. Society could no longer be lied to. Everyone could know the truth about anything that they could look at. A gift, some said, to a previously self-interested, self-obsessed society in which the collective good had been eroded.

But also a curse. For, it soon became apparent that through this Reality Prism, people could no longer kid themselves about their own puffed-up selves as they could see what they were really like – by and large, pretty ordinary. And this caused mass depression. People lost faith in themselves. Artists abandoned their pursuits, believing they were untalented and worthless.

Zach and Alice have never worn a Reality Prism and have no concept of their limitations. They were born after the World Governance Agency (WGA) destroyed all Reality Prisms, along with many other pre-revolution technologies, with the aim of restoring community and well-being. However, this has not been straightforward and in this post-Prism world, society has split into pretty much two factions:

- The Chippers who have had WiFi-enabled chips implanted into their brains, enabling them to record and broadcast what they see and think in real time; upload memories for future generations into a newly-created memoryBank and live-stream music and films directly into their brains.
- The Clocktarians, followers of the old pre-Prism ways who use steampunk-style technologies and who have elected not to have chips in their brains; they are regarded by the Chippers as backward-looking, stuck in a ‘clockwork, Victorian society’.

Everyone has a star, a limitless space on which to store their digital world.

Zach and Alice are Clocktarians. Their technology consists mainly of:

- A Proteus, a device made from programmable matter that can transform shape and function simply by the owners’ wishes. Zach calls his a diePad, in the shape of a tombstone in an ironic reference to an over-reliance on technology at the expense of memory.
- A Reality Checker, a clockwork mechanism that, at the point of critical velocity, projects an opaque human head that is linked to everything and can tell you anything. Every head has a personality and Zach’s is a handsome, laid back ‘dude’ who is like an electronic friend, who answers questions if he feels like it and often winds Zach up by giving him false information. And he often flirts with Alice.

- Zach
- Rock musician in band called The Reality Enigma.
- Spellbinding performer, has huge fan-base.
- Only people living in Elpis get to see The Reality Enigma in the flesh. Otherwise all performances are done via an oculus riff, a multisensory headset for experiencing virtual gigs.
- Zach’s music has influenced and changed thousands of lives.
- Wishes he had lived pre-Revolutionary times, the turn of the 21st Century, a golden age for music when bands performed in reality at festivals.
- Kind, gentle and self-doubting.
- Believes science and maths are dull and uninspiring. Creates a problem between him and Alice as she thinks that because he isn’t interested in science, he isn’t interested in her. Leads to lots of misunderstandings between them.

- Alice
- Shy, lonely, academically-gifted – estranged from the social world until she met Zach in the college library.
- Serious scientist, works at the Beimeni Centre of Genetics.
- At 21, won the World Science Federation’s Einstein Medal for her genetics research
- Desperately wants Zach to get over his fear of science so he can open his mind to the beauty of it.

Alice has been acting strangely, on edge for weeks, disconnected and uncommunicative, as if she is hiding something and Zach can’t get through to her. When he arrives home from band practice, unusually, she is already home and listening to an old album that the two of them enjoyed together, back in a simpler, less complicated time in their relationship. During an increasingly testy evening, which involves a discussion with the Head about whether or not a Proteus causes brain cancer, Alice is interrupted by an urgent call which she takes in private. She returns looking worried and is once again distracted. She tells Zach that she has ‘a big decision to make’. Before going to bed, Zach asks her if he can help with the decision but she says he ‘already has’, thanking him for making ‘everything easier.’ He has no idea what she means and goes to sleep, uneasy.

On waking, Zach senses that something is wrong. And he is right. Alice has disappeared. Her clothes, her possessions and every photo of them together have gone. He can’t get hold of any of her family or friends as their contact information is stored on her Proteus, not on his diePad. He manages to contact the Beimeni Centre but is told that no one by the name of Alice Nightingale has ever worked there. He logs into their constellation but her star has gone. He calls her but finds that her number never existed. She has, thinks Zach, been ‘wiped from the planet.’ He summons The Head but he can’t find her either. He tells Zach that there are three possibilities: Alice doesn’t want to be found, someone else doesn’t want her to be found, or she never existed.

Zach calls his friend Nick, fellow band member and fan of the WGA-installed Repositories, vast underground repositories of actual film, books, art and music. Nick is a Chipper – solely for the purpose of promoting the band using memoryBank – and he puts the word out to their fans about Alice missing.

Thinking as hard as he can, Zach recalls the lyrics of the song she’d been playing the previous evening. Maybe they are significant? It may well be a farewell message and the Head is right. In searching for clues, he comes across a ‘memory stone’ which tells him to read what’s on there. File 1 is a research paper that Zach can’t fathom. It’s written in the ‘language of science’ and the Head offers to help Zach translate it and tells him that it looks like the results of her current work were ‘gonna blow the world’. Zach resolves to do ‘something sensible’ with the report.

Zach doesn’t want to believe that Alice has simply just left him. Rather, that someone has taken her and tried to erase her from the world. He decides to find her therapist, Dr Murali Genari and get Alice’s file. As he breaks into his office, Dr Genari comes up behind him and demands to know what he is doing. He is shaking but not with rage – with fear of Zach. Dr Genari turns out to be friendly and invites Zach to talk to him. Together they explore the possibilities of where Alice might have gone and the likelihood, rating her relationship satisfaction, that she has left him. During their discussion Zach is interrupted by a message on his diePad from someone called Milton. Zach is baffled as to who he is and how he knows that he is currently discussing reverse scoring. Out of the corner of his eye, he spots a ginger cat jumping down from the window ledge outside. The counsellor has to go but suggests that Zach and ‘his new friend Milton’ could try and work things out.

This tutorial uses the following packages:

- `BayesFactor` [@RN9444] to compute Bayes factors
- `effsize` [@RN11406] to compute Cohen's *d*
- `Hmisc` [@RN11417] to compute confidence intervals
- `tidyverse` [@RN11407] for general data processing

These packages are automatically loaded within this tutorial. If you are working outside of this tutorial (i.e. in **RStudio**) then you need to make sure that the package has been installed by executing `install.packages("package_name")`, where *package_name* is the name of the package. If the package is already installed, then you need to reference it in your current session by executing `library(package_name)`, where *package_name* is the name of the package.

This tutorial has the data files pre-loaded so you shouldn't need to do anything to access the data from within the tutorial. However, if you want to play around with what you have learnt in this tutorial outside of the tutorial environment (i.e. in a stand-alone **RStudio** session) you will need to download the data files and then read them into your **R** session. This tutorial uses the following file:

You can load the files in several ways (using the first file as an example):

- Assuming that you follow the workflow recommended in the tutorial **adventr_02** (see also this online tutorial), you can load the data into an object called `teddy_tib` by executing:
    `teddy_tib <- readr::read_csv("../data/ais_10_teddy_therapy.csv")`
- If you don't follow my suggested workflow, you will need to adjust the file location in the above command.
- Alternatively, if you have an internet connection (and my server has not exploded!), load the file directly from the URL by executing:
    `teddy_tib <- readr::read_csv("http://www.discoveringstatistics.com/repository/ais_data/ais_10_teddy_therapy.csv")`

Adapt the above instructions to load the second data file into a tibble called `zhang_female_tib`.

Zach has just escaped from the Secret Philanthropic Society, who perform bizarre rituals that involve their members shoving their hands into boxes that may or may not contain a lethal gernal worm. They use probability to determine whether the box will be empty. He meets Emily, the member whose ritual he witnessed. She is riddled with doubts about the society's methods, but is powerless to escape. With Zach's help she does, and they are led by a strange moon-faced druid into a tombstone. It's never a good idea to follow moon-faced druids into tombstones. Nevertheless they do. Inside, Zach encounters the Doctrine of Chance, led by Sister Price. They explain alternative ways to evaluate evidence: specifically, effect sizes and Bayes factors.

We're going to use the data in the tibble called `teddy_tib`, which contains data from two studies looking at whether cuddling a teddy bear (compared to the cardboard box that the teddy was packaged in) affects self-reported self-esteem. This tibble contains 4 variables:

- **id**: Participant ID
- **group**: whether the participant cuddled a teddy bear or the cardboard box that contained the teddy bear
- **study_n**: Factor that distinguishes the two studies in the data set. The first was based on a total *N* of 20 and the second was based on a total *N* of 200
- **self_esteem**: Self-reported self-esteem scores

Using what you have learnt to date, can you create a tibble called `teddy_sum` containing the means, standard deviations and confidence intervals for each group across the two studies? (If you're doing this outside of the tutorial remember to load *tidyverse* [@RN11407] and *Hmisc* [@RN11417].)

```r
teddy_sum <- teddy_tib %>%
  dplyr::group_by(study_n, group) %>%
  dplyr::summarize(
    mean = mean(self_esteem),
    sd = sd(self_esteem),
    ci_low = ggplot2::mean_cl_normal(self_esteem)$ymin,
    ci_upp = ggplot2::mean_cl_normal(self_esteem)$ymax
  )
```

Now try plotting an error bar graph using `facet_wrap()` to display graphs for the two studies side by side. Use `coord_cartesian()` to set the *y*-limits to be 0-20, `scale_y_continuous()` to set ticks along the axis every two units (e.g., 0, 2, 4, 6 ...), and `labs()` to define labels for the *x*- and *y*-axes.

```r
ted_plot <- ggplot2::ggplot(teddy_tib, aes(group, self_esteem))
ted_plot +
  stat_summary(fun.data = "mean_cl_normal", size = 1) +
  facet_wrap(~ study_n) +
  coord_cartesian(ylim = c(0, 20)) +
  scale_y_continuous(breaks = seq(0, 20, 2)) +
  labs(x = "Experimental condition", y = "Self-esteem (0-20)") +
  theme_bw()
```

A useful measure of effect size is Cohen’s *d*, which is the difference between two means divided by some estimate of the standard deviation of those means:

$$ \hat{d} = \frac{\bar{X}_1-\bar{X}_2}{s} $$

I have put a hat on the *d* to remind us that we’re really interested in the effect size in the population, but because we can’t measure that directly, we estimate it from the samples (The hat means ‘estimate of’). By dividing by the standard deviation we are expressing the difference in means in standard deviation units (a bit like a *z*–score). The standard deviation is a measure of ‘error’ or ‘noise’ in the data, so *d* is effectively a signal-to-noise ratio. However, if we’re using two means, then there will be a standard deviation associated with each of them so which one should we use? There are three choices:

- If one of the groups is a control group it makes sense to use that group's standard deviation to compute *d* (the argument being that the experimental manipulation might affect the standard deviation of the experimental group, so the control group *SD* is a 'purer' measure of natural variation in scores).
- Sometimes we assume that group variances (and therefore standard deviations) are equal (homogeneity of variance) and if they are we can pick a standard deviation from either of the groups because it won't matter.
- We use what's known as a 'pooled estimate', which is the weighted average of the two group variances. This is given by the following equation:

$$ s_p = \sqrt{\frac{(N_1-1) s_1^2+(N_2-1) s_2^2}{N_1+N_2-2}} $$

Say we wanted to estimate *d* for the difference in self-esteem after cuddling the teddy compared to the box (control). Assuming you successfully created the tibble (`teddy_sum`) then you should have this information:

```r
knitr::kable(teddy_sum, caption = "Summary statistics for the teddy and box groups within the two experiments")
```

We have a logical control group so *d* is simply:

$$ \hat{d}_\text{teddy vs box} = \frac{15-10}{5.96} = 0.839 $$

If we use the pooled *s* then we'd get:

$$
\begin{aligned}
s_p &= \sqrt{\frac{(N_1-1) s_1^2+(N_2-1) s_2^2}{N_1+N_2-2}} \\
&= \sqrt{\frac{(10-1)5.96^2+(10-1)6.02^2}{10+10-2}} \\
&= \sqrt{\frac{645.86}{18}} \\
&= 5.99
\end{aligned}
$$

Basically, because the group standard deviations are more or less the same (5.96 and 6.02) the pooled estimate is very similar too (it falls between those two values). Consequently, the resulting effect size is not going to change much when we use the pooled estimate:

$$ \hat{d}_\text{teddy vs box} = \frac{15-10}{5.99} = 0.835 $$
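If you want to check this arithmetic for yourself in **R**, a quick sketch like the following reproduces the pooled estimate and the resulting *d*. (This is just a sanity check, not part of the tutorial's code; the group sizes, means and standard deviations are taken from the summary table above.)

```r
# Group summaries from the N = 20 study (teddy vs box)
n1 <- 10; n2 <- 10        # group sizes
s1 <- 5.96; s2 <- 6.02    # group standard deviations
m1 <- 15;  m2 <- 10       # group means (teddy, box)

# Pooled standard deviation: the weighted average of the group variances
s_p <- sqrt(((n1 - 1)*s1^2 + (n2 - 1)*s2^2)/(n1 + n2 - 2))
s_p              # approximately 5.99

# Cohen's d using the pooled estimate
(m1 - m2)/s_p    # approximately 0.835
```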

Let's look at computing Cohen's *d* using the pooled estimate in **R**. We're going to do this separately for the two studies in our tibble (`teddy_tib`). As a brief recap, this tibble contains 4 variables:

- **id**: Participant ID
- **group**: whether the participant cuddled a teddy bear or the cardboard box that contained the teddy bear
- **study_n**: Factor that distinguishes the two studies in the data set. The first was based on a total *N* of 20 and the second was based on a total *N* of 200. We're going to use this variable to `dplyr::filter()` the tibble before we compute *d*.
- **self_esteem**: Self-reported self-esteem scores

Like many things in **R**, there are numerous packages that offer functions to compute effect sizes. We're going to use the `cohen.d()` function from the `effsize` package [@RN11406], primarily because it computes the effect size from the raw data (rather than from summary statistics) and because it has an option to compute it in designs that use repeated measures (i.e. the means across conditions are dependent). The general format of this function is:

`cohen.d(formula = outcome ~ group_variable, pooled = TRUE, paired = FALSE, na.rm = FALSE, hedges.correction = FALSE, conf.level = 0.95)`

The arguments within this function are:

- `formula = outcome ~ group_variable`: this allows us to specify the outcome variable (in this case **self_esteem**) and the variable that defines the two groups (in this case **group**). This format of specifying a model as `outcome ~ predictor` will become very familiar to you as we progress through these tutorials.
- `pooled = TRUE`: by default the pooled standard deviation, $s_p$, will be used. If you set this option to FALSE then the standard deviation of the whole sample will be used. Typically you'd leave this option alone (i.e., use the pooled estimate).
- `paired = FALSE`: by default the function treats scores as coming from different entities (i.e. an independent design). That's the design our teddy bear experiments use. However, if you have a repeated-measures design (i.e. in our example, participants cuddle both a teddy and a box at different points in time) then set paired to TRUE.
- `na.rm = FALSE`: we've seen this option before. By default missing values are not removed; you should set this option to TRUE if you have missing values.
- `hedges.correction = FALSE`: this option determines whether a correction is applied to *d*, which converts *d* to something called Hedges' *g*. By default the correction isn't applied, but it is a good idea to apply it.
- `conf.level = 0.95`: by default the confidence interval will be a 95% one, but you can adjust this option to change the width of the confidence interval. The default is fine.

The function returns an object that contains the value of *d*, the confidence interval and some other information. We're going to use a pipe (`%>%`) to link:

- `teddy_tib`: our input will be the tibble of raw data
- `dplyr::filter(study_n == "Total N = 20")`: we're going to filter that tibble to extract the experiment based on a sample size of 20
- `cohen.d()`: the function described above will compute the effect size

```r
teddy_tib %>%
  dplyr::filter(study_n == "Total N = 20") %>%
  effsize::cohen.d(formula = self_esteem ~ group, data = .)
```

This command takes the `teddy_tib` tibble, extracts the study based on 20 participants, then applies the `cohen.d()` function. Within this function we have left the defaults as they are; our formula specifies that we're predicting **self_esteem** from the variable **group** (which defines whether the individual hugs a teddy or a box), and we specify the data within the function with a period (`data = .`). By using a period we're telling the function to use the data coming through from the pipe. Copy this command into the code box below and run the code. Then edit it to apply Hedges' correction, then edit the code again to get the effect size for the experiment that had 200 participants.

```r
# To apply Hedges' correction we'd add hedges.correction = TRUE:
teddy_tib %>%
  dplyr::filter(study_n == "Total N = 20") %>%
  effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = TRUE)

# To get the effect size for the larger study we'd change the filter() function
# to extract the study based on 200 participants:
teddy_tib %>%
  dplyr::filter(study_n == "Total N = 200") %>%
  effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = TRUE)
```

Applying the Hedges' correction to both studies, we'd get these outputs:

```r
teddy_tib %>%
  dplyr::filter(study_n == "Total N = 20") %>%
  effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = TRUE)

teddy_tib %>%
  dplyr::filter(study_n == "Total N = 200") %>%
  effsize::cohen.d(formula = self_esteem ~ group, data = ., hedges.correction = TRUE)
```

This shows that for the smaller study, the effect of cuddling a teddy compared to a box on self-esteem was *g* = -0.80 [-1.78, 0.18]. Whether *d* (or *g*) has a positive or negative sign reflects which way around you subtracted the group means. The teddy group had a mean of 15 and the box group a mean of 10. The difference between these means will be 5 if you subtract the box mean from the teddy mean (15 - 10 = 5), but -5 if you subtract the teddy mean from the box mean (10 - 15 = -5). So, to interpret the effect size look at the group means (not the plus or minus sign of *g*). In this case, self-esteem is 0.80 of a standard deviation higher after cuddling a teddy than after cuddling a box. The confidence interval for this effect size suggests (if we assume this sample is one of the 95% that contain the population value) that the population effect could range between -1.78 and 0.18. Crucially this means the effect could be 0, and could reflect higher self-esteem in the box group OR in the teddy group.

```r
question("In the larger study the confidence interval ranged from -1.13 to -0.54. What does this tell us?",
         answer("If this confidence interval is one of the 95% that contain the population value then the population value of the difference between group means lies between -1.13 and -0.54.", correct = TRUE),
         answer("There is a 95% chance that the population value of the difference between group means lies between -1.13 and -0.54.", message = "You cannot make probability statements from a confidence interval. We don't know whether this particular CI is one of the 95% that contain the population value of the difference between means."),
         answer("The probability of this confidence interval containing the population value is 0.95.", message = "The probability of this confidence interval containing the population value is either 0 (it doesn't) or 1 (it does), but it's impossible to know which."),
         answer("I can be 95% confident that the population value of the difference between group means lies between -1.13 and -0.54.", message = "Confidence intervals do not quantify your subjective confidence."),
         correct = "Correct - well done!",
         random_answer_order = TRUE,
         allow_retry = TRUE
)
```

I've struggled to find a function that will automatically calculate *d* using the standard deviation of the control group. However, it is simple enough to get **R** to subtract two values and then divide by another value. For example, to add *x* and *y* and then divide by *z* you could execute `(x + y)/z`. Simple. What's not simple is re-arranging the data so that we have the group means in columns. It can be done, but it involves restructuring the data in a crazy pipe involving lots of functions that we haven't learnt! I'm going to plough through how it's done and explain what everything does, but not cover the new functions in detail. So, by all means ignore this section if you like.

Earlier we created a tibble called `teddy_sum` that includes the means, standard deviations and confidence intervals of those means. The problem is that the means and *sd*s for the box and teddy groups are in different rows, and we want them in different columns so that we can subtract them. This is to remind you of what the tibble currently contains:

```r
knitr::kable(teddy_sum, caption = "Summary statistics for the teddy and box groups within the two experiments")
```

I'm going to restructure this tibble so that all of the information from a given experiment is in a single row (that is, create columns containing the means, *sd*s and confidence intervals for each group). That's the hard part. Then I will add a column that computes *d* (that's the easy part!).

Restructuring data using the *tidyverse* package *tidyr* is something that I find endlessly confusing, so if you find it confusing too then don't worry - you're in good company! To restructure the data we're going to use this pipe:

```r
teddy_wide <- teddy_sum %>%
  tidyr::gather(measure, value, -c(study_n, group)) %>%
  tidyr::unite(gp_measure, group, measure) %>%
  tidyr::spread(gp_measure, value)
```

It's a pipe that could carry gas across the Atlantic. Let's break it down.

- `teddy_sum %>%`: we begin with the `teddy_sum` tibble and carry that into the next function using the pipe operator (`%>%`).
- `tidyr::gather(measure, value, -c(study_n, group))`: this takes each variable in `teddy_sum` and places its value in a variable called **value** (you can name it what you like, I chose 'value') and creates a variable called **measure** (again, you can name it differently) that identifies what the value relates to. For example, if the value is from the column **mean** then the value placed in **measure** will be the word 'mean'. Crucially, I have used `-c(study_n, group)` to exclude the variables **study_n** and **group** from this process; these variables retain their existing values. To get a feel for what's happening, execute the pipe just to this point in the code box (i.e., execute `teddy_sum %>% tidyr::gather(measure, value, -c(study_n, group))`). Essentially the variables **mean**, **sd**, **ci_low** and **ci_upp**, which were in 4 columns, have been replaced by two columns: one (**measure**) contains the name of the original column and the other (**value**) contains the value.
- `tidyr::unite(gp_measure, group, measure)`: this is used to unite the variables **group** and **measure** into a single variable called **gp_measure** (again, choose a different name if you like). It basically combines the values from the columns and separates them with an underscore. For example, if the value of **group** is *Box* and the value of **measure** is *mean* then the resulting value in **gp_measure** will be *Box_mean*. In effect, we're creating a variable to contain the column names to use when we transform the data back. Again, if you want to see what's going on, use the code box to run the pipe up to this point (`teddy_sum %>% tidyr::gather(measure, value, -c(study_n, group)) %>% tidyr::unite(gp_measure, group, measure)`).
- `tidyr::spread(gp_measure, value)`: finally, we spread the data back out into separate columns, using **gp_measure** to define the column names and **value** to define the values placed within each column.

```r
teddy_wide <- teddy_sum %>%
  tidyr::gather(measure, value, -c(study_n, group)) %>%
  tidyr::unite(gp_measure, group, measure) %>%
  tidyr::spread(gp_measure, value)
teddy_wide
```

Run the full pipe to create this new tibble (which I've called `teddy_wide`). Execute the name of the tibble to look at it. The resulting tibble contains these variables:

- **study_n**: Factor that distinguishes the two studies in the data set. The first was based on a total *N* of 20 and the second was based on a total *N* of 200.
- **Box_ci_low**: lower limit of the confidence interval in the box group.
- **Box_ci_upp**: upper limit of the confidence interval in the box group.
- **Box_mean**: mean self-esteem in the box group.
- **Box_sd**: standard deviation of self-esteem in the box group.
- **`Teddy Bear_ci_low`**: lower limit of the confidence interval in the teddy group.
- **`Teddy Bear_ci_upp`**: upper limit of the confidence interval in the teddy group.
- **`Teddy Bear_mean`**: mean self-esteem in the teddy group.
- **`Teddy Bear_sd`**: standard deviation of self-esteem in the teddy group.

The variables for the teddy bear group have been named a bit clunkily (because we'd defined the group as 'Teddy Bear'). Let's tidy these names up using the `rename()` function:

```r
teddy_wide <- teddy_wide %>%
  rename(
    teddy_ci_low = `Teddy Bear_ci_low`,
    teddy_ci_upp = `Teddy Bear_ci_upp`,
    teddy_mean = `Teddy Bear_mean`,
    teddy_sd = `Teddy Bear_sd`
  )
```

This code recreates the `teddy_wide` tibble from itself but creates a variable **teddy_ci_low** from the variable currently called **`Teddy Bear_ci_low`**, and so on. In effect I'm renaming the 'Teddy Bear' variables to omit spaces and remove upper case letters (and be shorter). Try this in the code box and inspect the resulting tibble.

```r
teddy_wide <- teddy_sum %>%
  tidyr::gather(measure, value, -c(study_n, group)) %>%
  tidyr::unite(gp_measure, group, measure) %>%
  tidyr::spread(gp_measure, value)

teddy_wide <- teddy_wide %>%
  rename(
    teddy_ci_low = `Teddy Bear_ci_low`,
    teddy_ci_upp = `Teddy Bear_ci_upp`,
    teddy_mean = `Teddy Bear_mean`,
    teddy_sd = `Teddy Bear_sd`
  )
teddy_wide
```

Well done if you're still with me! Let's now do the (relatively) easy bit. Now we have the group means in columns, it's relatively straightforward to use `mutate()` to add a column that contains *d*:

```r
teddy_wide <- teddy_wide %>%
  dplyr::mutate(
    d = (teddy_mean - Box_mean)/Box_sd
  )
```

This code re-creates the `teddy_wide` tibble from itself, but then uses `mutate()` to add a variable called **d**, which is defined as `(teddy_mean - Box_mean)/Box_sd`. That is, it subtracts the scores in the column **Box_mean** from the scores in the column **teddy_mean** and divides the result by the scores in the column **Box_sd**. In other words, it takes the difference between the group means and divides it by the standard deviation of the control group! Try this below and then view the resulting tibble to see the values of *d*.

```r
teddy_wide <- teddy_sum %>%
  tidyr::gather(measure, value, -c(study_n, group)) %>%
  tidyr::unite(gp_measure, group, measure) %>%
  tidyr::spread(gp_measure, value)
teddy_wide <- teddy_wide %>%
  rename(
    teddy_ci_low = `Teddy Bear_ci_low`,
    teddy_ci_upp = `Teddy Bear_ci_upp`,
    teddy_mean = `Teddy Bear_mean`,
    teddy_sd = `Teddy Bear_sd`
  )

teddy_wide <- teddy_wide %>%
  dplyr::mutate(
    d = (teddy_mean - Box_mean)/Box_sd
  )
teddy_wide
```

```r
question("Which of these statements about Cohen's *d* is **NOT** correct?",
         answer("The value of *d* cannot exceed 1.", correct = TRUE, message = "This statement is false and so is the correct answer."),
         answer("*d* is the difference between two means expressed in standard deviation units.", message = "This statement is true so is not the correct answer."),
         answer("A *d* of 0.2 would be considered small.", message = "This statement is true so is not the correct answer."),
         answer("*d* can be computed using a control group standard deviation, the standard deviation of all scores or a pooled standard deviation.", message = "This statement is true so is not the correct answer."),
         correct = "Correct - well done!",
         random_answer_order = TRUE,
         allow_retry = TRUE
)
```

The Bayes factor quantifies the probability of the data given the alternative hypothesis (in this case that self-esteem differs in the teddy and box groups) relative to the probability of the data given the null hypothesis (in this case that self-esteem is the same in the teddy and box groups):

$$ \text{Bayes factor} = \frac{p(\text{data}|\text{alternative})}{p(\text{data}|\text{null})} $$

The Bayesian *t*-test uses a default prior distribution based on a value, *r*, which scales the distribution [@RN9316]. The advantage of this approach is that it enables people to use a Bayesian method without needing to choose a prior from an almost endless set of possibilities. By constraining the decisions, some of the hard thinking is removed. The disadvantage is that defaults can tempt people to use them without any thought, and it’s tricky to understand what’s going on under the hood. To encourage you to give some thought to your prior, let’s look under the hood (just a bit). The prior distribution is constrained to be from a family known as the Cauchy distribution. Members of the Cauchy family look a bit like normal distributions, but they’re not. Figure 1 shows Cauchy distributions with different scale factors *r*: medium, *r* = 0.5 (left); wide, *r* = 0.7071 (middle); and ultrawide, *r* = 1 (right). Each curve has a shaded region with boundaries from −*r* to +*r*, and the region under the curve between these boundaries represents 50% of the area. Note that this region gets wider as *r* increases. When you set the value of *r* you are assigning a 50% probability that the effect size (Cohen’s *d*) lies between −*r* and +*r*. Given the distribution is symmetrical, you’re also assigning a 25% prior belief that *d* is greater than *r* and the remaining 25% that *d* is less than −*r*.

If you use the default value of *r* = 0.7071 you are, therefore, saying that your prior belief is that there’s a 50% probability that the effect size (Cohen’s *d*) lies between −0.7071 and +0.7071, a 25% probability that it is greater than +0.7071 and a 25% probability that it is less than −0.7071. Remembering what *d* represents, you’re saying that there’s a 50% probability that the difference between the two means lies between −0.7071 and +0.7071 standard deviations.
If you set *r* = 0.5 (Figure 1, left) the Cauchy distribution becomes narrower and you are similarly narrowing your prior beliefs to a 50% probability that the difference between the group means lies between −0.5 and +0.5 standard deviations. Setting *r* = 1 (Figure 1, right) widens the distribution, representing wider beliefs. You’d be placing a 50% probability that the difference between the group means lies between −1 and +1 standard deviations. Remembering that *d* = 0.8 is a large effect, you’d be assigning a 50% probability to the difference between means being somewhere between a very large effect in one direction and a very large effect in the opposite direction. This is too wide a belief for most situations (bearing in mind that another 50% of the probability is assigned to the effect being outside those limits).
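You can check the 50% claim yourself with base R's `pcauchy()` (the Cauchy cumulative distribution function): for a Cauchy distribution with scale *r*, the area between −*r* and +*r* is always exactly one half, whatever the value of *r*. A quick sketch:

```r
# Probability that a Cauchy(location = 0, scale = r) variable falls in [-r, r].
# The CDF at +r is 0.75 and at -r is 0.25, so this area is always 0.5.
prior_mass <- function(r) {
  pcauchy(r, location = 0, scale = r) - pcauchy(-r, location = 0, scale = r)
}

prior_mass(0.5)     # narrower prior: 0.5
prior_mass(0.7071)  # the default:    0.5
prior_mass(1)       # wider prior:    0.5
```

So changing *r* doesn't change the 50% probability itself, only how wide the interval is that the 50% is spread over.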

The `BayesFactor` package contains various functions for computing Bayes factors, and we'll use it at various points for other models. For now, we'll use the `ttestBF()` function, which is designed for situations where you want a Bayes factor that quantifies the difference between two means. It has the general form:

`ttestBF(formula = outcome ~ group_variable, mu = 0, paired = FALSE, data = NULL, rscale = "medium")`

It has some other arguments not listed above, but these are the key ones for our purposes:

`formula = outcome ~ group_variable`
:   This allows us to specify the outcome variable (in this case **self_esteem**) and the variable that defines the two groups (in this case **group**). This is basically the same as for the `cohen.d()` function.

`mu = 0`
:   For this example we can ignore this, but if you want to use the function on a one-sample or paired design then you can use it to specify the null value of the mean (or mean difference in a paired design). The default of 0 is usually what you want.

`paired = FALSE`
:   By default the function treats scores as coming from different entities (i.e. an independent design). That's the design our teddy bear experiments use. However, if you have a repeated-measures design (i.e. in our example, participants cuddle both a teddy and a box at different points in time) then set `paired = TRUE`.

`data = NULL`
:   Use this option to specify the tibble containing the data.

`rscale = "medium"`
:   This option specifies the prior. You can input a numeric value or use one of three pre-defined defaults: "medium", "wide" and "ultrawide", which correspond to values of sqrt(2)/2 (i.e. 0.7071), 1 and sqrt(2) respectively. I explained these priors earlier.
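To make those named defaults concrete, here is a small sketch (base R only) of the numeric value each one corresponds to, and the central interval of *d* to which each assigns a 50% prior probability:

```r
# Numeric scale values of the three named priors
rscales <- c(medium = sqrt(2) / 2, wide = 1, ultrawide = sqrt(2))

# Each prior places a 50% probability on d lying between -r and +r
data.frame(
  prior = names(rscales),
  r     = round(rscales, 4),
  lower = round(-rscales, 4),
  upper = round(rscales, 4)
)
# medium: r = 0.7071, wide: r = 1, ultrawide: r = 1.4142
```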

Like when we computed an effect size, we're going to use a pipe (`%>%`) to link:

`teddy_tib`
:   Our input will be the tibble of raw data.

`dplyr::filter(study_n == "Total N = 20")`
:   We're going to filter that tibble to extract the experiment based on a sample size of 20.

`ttestBF()`
:   The function described above will compute the Bayes factor.

teddy_tib %>% dplyr::filter(study_n == "Total N = 20") %>% BayesFactor::ttestBF(formula = self_esteem ~ group, data = .)

This command takes the `teddy_tib` tibble, extracts the study based on 20 participants, then applies the `ttestBF()` function. You'll get a warning about the data being coerced from a tibble to a data frame: the function works with data frames, so it converts your tibble to one. The warning is nothing to worry about.

Within this function we have left the defaults as they are (i.e. a 'medium' prior scale value of 0.7071), our formula specifies that we're predicting **self_esteem** from the variable **group** (which defines whether the individual hugs a teddy or a box), and we specify the data within the function with a period (`data = .`). By using a period we're telling the function to use the data coming through the pipe. Copy this command into the code box below and run it. Then edit it to get the Bayes factor for the experiment that had 200 participants.

teddy_tib %>% dplyr::filter(study_n == "Total N = 200") %>% BayesFactor::ttestBF(formula = self_esteem ~ group, data = .)

The resulting Bayes factor is 1.27, which means that the probability of the data given the alternative hypothesis is about the same as the probability of the data given the null. This value suggests that we should not change our prior beliefs: by using the default prior we assigned a 50% probability to the effect size (*d*) lying between −0.7071 and +0.7071, and this Bayes factor tells us not to shift that belief.

For the larger sample the Bayes factor is 754768.4, which is huge. The probability of the data given the alternative hypothesis is 754768.4 times greater than the probability of the data given the null. This value suggests that we should shift our prior beliefs towards the alternative hypothesis by a factor of 754768.4. In other words, having looked at the data we should hold a very much stronger belief that the means are different.
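If you want the Bayes factor as a plain number (e.g. to report it, or to express the evidence the other way around, for the null), the `BayesFactor` package provides `extractBF()`, and a Bayes factor object can be inverted with `1/`. A sketch, assuming `teddy_tib` is loaded as elsewhere in this tutorial:

```r
library(dplyr)
library(BayesFactor)

# Store the result instead of just printing it
bf_200 <- teddy_tib %>%
  dplyr::filter(study_n == "Total N = 200") %>%
  BayesFactor::ttestBF(formula = self_esteem ~ group, data = .)

# Pull out the numeric Bayes factor (alternative relative to null)
extractBF(bf_200)$bf

# The reciprocal quantifies evidence for the null relative to the alternative
1 / bf_200
```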

Remember that the effect sizes in the two studies were basically identical, so it may seem strange that the Bayes factors are so different. However, Bayesian methods update prior beliefs using the data, and a small amount of data will have less impact on beliefs than a huge sample (if the priors are the same). Imagine you're asked the probability that a sports team will win their next game. You guess a 33% chance that they'll win (you assume win, lose and draw are equally likely). Take two scenarios. In the first, you're told that this team won their last game. This might increase your belief in a win, but probably not by much - perhaps that game was a fluke. In the second, you're told that they have won their last 20 games. Your belief will probably shift quite strongly towards a win - they have consistently shown a winning mentality. In both cases you're being told that they have won 100% of their past games, but in the first case that is based on a tiny amount of data, whereas in the latter case it's based on a lot of data. This illustrates the principle that (other things being equal) larger data sets have a greater influence on beliefs.

question("What is a Bayes factor?", answer("The relative probability of the data given the alternative to the data given the null.", correct = TRUE), answer("The ratio of the probability of the alternative hypothesis given the data to the probability of the null hypothesis given the data.", message = "This statement describes the posterior odds."), answer("The ratio of the probability of the alternative hypothesis to the probability of the null hypothesis.", message = "This statement describes the prior odds."), answer("Tell me a terrible and possibly inappropriate joke about Bayesians.", message = "Q. Why do Bayesians make good proctologists? A. Because they're always looking at the posterior."), correct = "Correct - well done!", random_answer_order = TRUE, allow_retry = TRUE )

The tibble `zhang_female_tib` contains a small random subsample of females from a study that looked at performance on a maths test when the test is taken under the participant's own name or a fake name. There are two variables:

- **name**: Specifies whether participants completed the test under their own name or a fake name
- **accuracy**: Accuracy on the maths test (%)

In the code box, produce some code to:

- Produce an error bar plot of the two group means and their confidence intervals
- Compute Hedges' *g* for the difference between mean accuracy scores in the two groups
- Compute a Bayes factor comparing the two group means

zhang_plot <- ggplot2::ggplot(zhang_female_tib, aes(name, accuracy))
zhang_plot +
  stat_summary(fun.data = "mean_cl_normal", size = 1) +
  coord_cartesian(ylim = c(0, 100)) +
  scale_y_continuous(breaks = seq(0, 100, 10)) +
  labs(x = "Experimental condition", y = "Accuracy on maths test (%)") +
  theme_bw()

#To get Hedges' g:
zhang_female_tib %>%
  effsize::cohen.d(formula = accuracy ~ name, data = ., hedges.correction = TRUE)

#To get a Bayes factor:
zhang_female_tib %>%
  BayesFactor::ttestBF(formula = accuracy ~ name, data = .)

- The tutorials typically follow examples described in detail in @RN10163, so for most of them there's a thorough account in there. You might also find @RN4832 useful for the **R** stuff.
- There are free lectures and screencasts on my YouTube channel.
- There are free statistical resources on my website, www.discoveringstatistics.com

- Information on using ggplot2
- R for data science is the open-access version of the book by tidyverse creator Hadley Wickham [@RN11404]. It covers the *tidyverse* and data management.
- ModernDive is an open-access textbook on **R** and **RStudio**.
- RStudio cheat sheets
- RStudio list of online resources
- SwirlStats is a package for **R** that launches a bunch of interactive tutorials.
