knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(Stat231) library(tidyverse) library(openintro)
For this lab, we will be peaking at section 7.5 (without worry about all the formulas). The analysis of variance - anova - test is equivalent to the chi-squared test but for numeric data instead of categorical data; It will compare the means of three or more samples.
We will be going back to the diamonds data set and running an anova test to see if there is a difference in the average price of diamonds based on clarity. We will nest the test inside a summary()
function to get the test and print out of results in one line.
summary(aov(price ~ clarity, data = diamonds))
The important parts of this print out are the $F$ value and $Pr(>F)$. The $F$ value is similar to a $z$ or $t$ score. The $Pr(>F)$ is a p-value.
Using the p-value method, $<2e-16$ means the p-value is less than $0.0000000000000002$. Using any of the typical values for alpha, we would reject our null hypothesis and can say there is a difference in the average price of diamonds based on their clarity rating.
Using the classic method, we need to calculate a critical F value.
qf(0.95, #1-alpha df1 = 7, df2 = 53932)
This means our critical region starts at approximately $F = 2$, and our test statistic is $F = 215$. We are well into the critical region and again conclude that there is a difference in the average price of diamonds based on their clarity.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.