## Do not delete this!
## It loads the s20x library for you. If you delete it
## your document may not compile it.
require(s20x)
knitr::opts_chunk$set(
  dev = "png",
  fig.ext = "png",
  dpi = 96
)
require(emmeans)
We want to quantify the expected final exam mark (out of 100) in Stats 20x for each type of degree. In particular, we want to investigate whether there is a "degree" effect on the final exam mark.
The variables of interest were:
Exam: A student's exam mark out of 100.

Degree: A four-level factor with levels corresponding to a student's degree.

Is the degree a student is enrolled for related to their final 20x exam score?
load(system.file("extdata", "Stats20x.df.rda", package = "s20x"))
Stats20x.df = read.table("STATS20x.txt", header = TRUE)
Stats20x.df$Degree = factor(Stats20x.df$Degree)
# Draw boxplot
plot(Exam ~ Degree, data = Stats20x.df)
# Summary stats:
summaryStats(Exam ~ Degree, Stats20x.df)
The "BSc" group is centred noticeably lower than the others. The standard deviations are within a factor of two from smallest to largest, so we can accept the equality of variance assumption. (The midspreads do exceed the factor-of-two rule-of-thumb, so we might need to be cautious in our interpretations.)
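The factor-of-two rule of thumb used above can also be checked numerically. The following is a minimal sketch in base R. The data frame `demo.df` is simulated purely for illustration (it is not the Stats 20x data, and its group sizes are made up); the same calls should apply to `Stats20x.df`.

```r
# Simulated stand-in data: NOT the real Stats 20x marks.
set.seed(1)
demo.df <- data.frame(
  Degree = factor(rep(c("BA", "BCom", "BSc", "Other"), times = c(20, 60, 50, 16))),
  Exam   = round(runif(146, min = 10, max = 95))
)

# Standard deviation and midspread (IQR) of Exam within each degree group
group.sd  <- tapply(demo.df$Exam, demo.df$Degree, sd)
group.iqr <- tapply(demo.df$Exam, demo.df$Degree, IQR)

# Ratio of the largest to the smallest spread;
# a ratio below 2 passes the rule of thumb
sd.ratio  <- max(group.sd) / min(group.sd)
iqr.ratio <- max(group.iqr) / min(group.iqr)

print(round(c(sd.ratio = sd.ratio, iqr.ratio = iqr.ratio), 2))
```

Reporting both ratios side by side makes it easy to see when the standard deviations pass the rule while the midspreads do not.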
degree.fit = lm(Exam ~ Degree, data = Stats20x.df)
modelcheck(degree.fit)
anova(degree.fit)
summary(degree.fit)
options(digits = 4)
pairs(emmeans(degree.fit, ~Degree), infer = TRUE)

options(digits = 4)
stats.pair = as.data.frame(pairs(emmeans(degree.fit, ~Degree), infer = TRUE))
conf1 = subset(stats.pair, p.value < 0.05)
resultStr1 = paste0(sprintf("%.0f", conf1$lower.CL), " and ", sprintf("%.0f", conf1$upper.CL))
resultStr2 = paste0(sprintf("%.0f", abs(conf1$upper.CL)), " and ", sprintf("%.0f", abs(conf1$lower.CL)))
We wish to explain exam marks using degree, a factor with four levels, so we fitted a one-way ANOVA model to these data.
The model assumptions seem satisfied.
Our final model is $$\text{Exam}_i = \beta_0 + \beta_1 \times \text{Degree.BCom}_i + \beta_2 \times \text{Degree.BSc}_i + \beta_3 \times \text{Degree.Other}_i + \epsilon_i,$$ where $\text{Degree.}x_i$ is 1 if a student is enrolled in degree $x$ and 0 otherwise (with $x \in \{\text{BCom}, \text{BSc}, \text{Other}\}$), and $\epsilon_i \sim iid~N(0,\sigma^2)$.
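To see how the indicator variables in this model are built, the sketch below prints the design matrix R constructs for a four-level factor. It assumes R's default treatment contrasts, under which the alphabetically first level (BA) becomes the baseline. Note that R labels the columns `DegreeBCom`, `DegreeBSc` and `DegreeOther`, without the dot used in the equation.

```r
# One observation per degree level; R orders the levels alphabetically,
# so BA is the baseline absorbed into the intercept
Degree <- factor(c("BA", "BCom", "BSc", "Other"))

# Design matrix under the default treatment contrasts:
# an intercept column plus one 0/1 indicator per non-baseline level
X <- model.matrix(~ Degree)
print(X)
```

Each row has a 1 in the intercept column and at most one further 1, in the column for that student's degree; the BA row has none.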
Alternatively, our final model could be written as $$\text{Exam}_{ij} = \mu + \alpha_i + \epsilon_{ij},$$ where $\mu$ is the overall mean exam mark and $\alpha_i$ is the effect of being in the $i$th degree (with $i \in \{\text{BA}, \text{BCom}, \text{BSc}, \text{Other}\}$), and $\epsilon_{ij} \sim iid~N(0,\sigma^2)$.
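The two formulations describe the same four group means. As a short worked check, using only the symbols defined above, the expected exam mark for each degree can be read off both models:

$$\begin{aligned}
E[\text{Exam} \mid \text{BA}] &= \beta_0 = \mu + \alpha_{\text{BA}}, \\
E[\text{Exam} \mid \text{BCom}] &= \beta_0 + \beta_1 = \mu + \alpha_{\text{BCom}}, \\
E[\text{Exam} \mid \text{BSc}] &= \beta_0 + \beta_2 = \mu + \alpha_{\text{BSc}}, \\
E[\text{Exam} \mid \text{Other}] &= \beta_0 + \beta_3 = \mu + \alpha_{\text{Other}}.
\end{aligned}$$

So, for example, $\beta_1 = \alpha_{\text{BCom}} - \alpha_{\text{BA}}$ is the difference in expected exam mark between BCom and BA students.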
Our model explained 13.2% of the variability in students' exam marks.
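The percentage of variability explained is the Degree sum of squares divided by the total sum of squares in the ANOVA table. The sketch below illustrates that calculation on simulated data; `demo.df` and the group means used to generate it are invented for illustration, so the value it prints will not be 13.2%.

```r
# Simulated stand-in data: NOT the real Stats 20x marks.
set.seed(20)
demo.df <- data.frame(
  Degree = factor(rep(c("BA", "BCom", "BSc", "Other"), each = 25)),
  Exam   = rnorm(100, mean = rep(c(55, 57, 48, 58), each = 25), sd = 18)
)
demo.fit <- lm(Exam ~ Degree, data = demo.df)

# Sums of squares from the ANOVA table:
# first entry is Degree (between groups), second is Residuals (within groups)
ss <- anova(demo.fit)[["Sum Sq"]]
r.squared <- ss[1] / sum(ss)

# The hand calculation should agree with the value reported by summary()
print(r.squared)
print(summary(demo.fit)$r.squared)
```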
Is the degree a student is enrolled in related to their final 20x exam mark?
We do have evidence that expected exam marks were not identical between the four degree groups (BA, BCom, BSc, and Other). However, the only significant differences we found were that BSc students had lower marks than BCom and Other degree students.
With 95% confidence we can say that:
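The expected exam mark of BCom students is higher than that of BSc students by between `r resultStr1[1]` marks.

The expected exam mark of Other degree students is higher than that of BSc students by between `r resultStr2[2]` marks.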
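As a cross-check on pairwise intervals of this kind, base R's `TukeyHSD()` can be used in place of `emmeans`. This is a swapped-in technique rather than the code used in this report, and the sketch below runs on simulated data (`demo.df` is invented for illustration). One difference to keep in mind: `TukeyHSD()` labels each difference as the later level minus the earlier one (for example `BCom-BA`), so its signs are reversed relative to the `pairs()` output.

```r
# Simulated stand-in data: NOT the real Stats 20x marks.
set.seed(20)
demo.df <- data.frame(
  Degree = factor(rep(c("BA", "BCom", "BSc", "Other"), each = 25)),
  Exam   = rnorm(100, mean = rep(c(55, 57, 48, 58), each = 25), sd = 18)
)

# Fit the one-way ANOVA and compute Tukey-adjusted 95% intervals
# for all six pairwise differences between degree groups
demo.aov <- aov(Exam ~ Degree, data = demo.df)
tukey <- as.data.frame(TukeyHSD(demo.aov, conf.level = 0.95)$Degree)

# Keep only the pairs whose adjusted p-value is below 0.05
sig <- tukey[tukey[["p adj"]] < 0.05, , drop = FALSE]

print(tukey)
print(sig)
```

Any row kept in `sig` has an interval that excludes zero, which is what licenses a "between ... and ... marks" statement for that pair.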