$\$
# install.packages("latex2exp") library(latex2exp) knitr::opts_chunk$set(echo = TRUE) knitr::opts_chunk$set(fig.width=6, fig.height=4) set.seed(230)
# get some images that are used in this document SDS230::download_image("gingko_pills.jpg") SDS230::download_data("gingko_RCT.rda")
$\$
$\$
Joy Milne claimed to have the ability to smell whether someone had Parkinson’s disease.
To test this claim, researchers gave Joy 6 shirts that had been worn by people who had Parkinson’s disease and 6 people who did not.
Joy identified 11 out of the 12 shirts correctly.
Let's run a hypothesis test to assess whether there is significant evidence to suggest that Joy can really could smell whether someone has Parkinson's disease.
$\$
In words:
Using symbols
$H_0: \pi = 0.5$ $H_A: \pi > 0.5$
Rules of the game
If there is a less than 5% probability we would get a random statistic as or more extreme than the observed statistic if $H_0$ was true, then we will reject $H_0$ and say that $H_A$ is likely to be true.
$\alpha = 0.05$
$\$
(obs_stat <- 11/12)
$\$
flip_sims <- rbinom(10000, 12, .5) flip_sims_prop <- flip_sims/12 barplot(table(flip_sims), xlab = "Number of heads (i.e., correct guesses)", ylab = "Number of simulations", main = "10,000 simulations of 12 coin flips") hist(flip_sims_prop, breaks = 100) table(flip_sims)
$\$
(p_value <- sum(flip_sims_prop >= obs_stat)/length(flip_sims))
$\$
Since r p_value
is less than $\alpha = 0.05$ we can reject the null hypothesis (and perhaps say the results are "statistically significant").
$\$
Questions
Do you believe Joy can really smell Parkinson's disease?
Is it better to report the actual p-value or just whether we rejected the null hypothesis $H_0$?
$\$
Let's us examine the randomized controlled trial experiment by Solomon et al (2002) to see if there is evidence that taking a gingko pills improves memory. To read the original paper see: https://jamanetwork.com/journals/jama/fullarticle/195207
$H_0: \mu_{gingko} - \mu_{control} = 0$ $H_A: \mu_{gingko} - \mu_{control} \ne 0$
$\alpha = 0.05$
$\$
load("gingko_RCT.rda") # plot the data boxplot(gingko, placebo, names = c("Gingko", "Placebo"), ylab = "Memory score") # create a stripchart data_list <- list(gingko, placebo) stripchart(data_list, group.names = c("Gingko", "Placebo"), method = "jitter", xlab = "Memory score", col = c("red", "blue"))
$\$
(obs_stat <- mean(gingko) - mean(placebo))
$\$
# combine the data from the treatment and placebo groups together combined_data <- c(gingko, placebo) n_gingko <- length(gingko) total <- length(combined_data) # use a for loop to create shuffled treatment and placebo groups and shuffled statistics null_distribution <- NULL for (i in 1:10000) { # shuffle data shuff_data <- sample(combined_data) # create fake treatment and control groups shuff_gingko <- shuff_data[1:n_gingko] shuff_placebo <- shuff_data[(n_gingko + 1):total] # save the statistic of interest null_distribution[i] <- mean(shuff_gingko) - mean(shuff_placebo) } # plot the null distribution as a histogram hist(null_distribution, breaks = 20, main = "Null distribution", xlab = TeX("$\\bar{x}_{shuff-gingko} - \\bar{x}_{shuff-placebo}$"))
$\$
# plot the null distribution again with a red line a the value of the observed statistic hist(null_distribution, breaks = 20, main = "Null distribution", xlab = TeX("$\\bar{x}_{shuff-gingko} - \\bar{x}_{shuff-placebo}$")) abline(v = obs_stat, col = "red") # calculate the p-value (p_value_left_tail <- sum(null_distribution <= obs_stat)/10000) (p_value_right_tail <- sum(null_distribution >= abs(obs_stat))/10000) (p_value <- p_value_left_tail + p_value_right_tail)
$\$
Since r p_value
is greater than $\alpha = 0.05$ we can not reject the null hypothesis. Thus if we are using the Neyman-Pearson paradigm, we do not have sufficient evidence to say that the pill is effective.
$\$
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.