Case Study 3.1: Exam by itself

# Do not delete this!
# It loads the s20x library for you. If you delete it 
# your document may not compile it.
require(s20x)

knitr::opts_chunk$set(
  dev = "png",
  fig.ext = "png",
  dpi = 96
)

Problem

We wish to investigate the distribution of exam marks. In particular, we want to test the hypothesis that the underlying mean value of exam score is the ``historical average'' of 55.

The variable of interest is:

Question of Interest

We were interested in building a model to describe exam marks. In particular, we want to test the hypothesis that the underlying mean value of exam score is the ``historical average'' of 55.

Read in and Inspect Data

load(system.file("extdata", "Stats20x.df.rda", package = "s20x"))
Stats20x.df = read.table("STATS20x.txt", header = T)
hist(Stats20x.df$Exam,xlab="Exam",main="")
summaryStats(Stats20x.df$Exam)
hist(Stats20x.df$Exam,xlab="Exam",main="")
summaryStats(Stats20x.df$Exam)

The exams marks are centred just above 50. The data look reasonably unimodal and symmetrical -- roughly normal. Some slight right-skewness, but does not look like a problem.

Fitting the null model

By hand...

( mn_exam = mean(Stats20x.df$Exam) ) # Sample mean
( sd_exam = sd(Stats20x.df$Exam) ) # Sample standard deviation
( n_exam = length(Stats20x.df$Exam) ) # Sample size
( tmult_exam = qt(1 - 0.05/2, df = n_exam - 1) ) # t-multiplier
( CI_exam = mn_exam + tmult_exam * c(-1, 1) * sd_exam/sqrt(n_exam) ) # Confidence Interval
( se_exam = sd_exam/sqrt(n_exam) ) # Standard error
(t_stat_exam = (mn_exam - 55)/(se_exam) ) # t-stat
(pval_exam = 2 * (1 - pt(abs(t_stat_exam), df = n_exam - 1)) ) # p-value

Using lm...

examNull.fit55 = lm(I(Exam-55) ~ 1, data = Stats20x.df)
( pval_exam = coef(summary(examNull.fit55))[4] )
55+confint(examNull.fit55) 
cf = 55+confint(examNull.fit55) 
resultConf = paste0(sprintf("%.1f", cf[1]), " to ", sprintf("%.1f", cf[2]))

Finaly, with the t.test function

t.test(Stats20x.df$Exam, mu = 55)

Method and Assumption Checks

There are no explanatory variables, and so a null model was fitted.

From examining the histogram it appears that the data are roughly normally distributed, so model assumptions are satisfied.

Our final model is $$Exam_i=\beta_0 + \epsilon_i \text{ (or }Exam_i=\mu + \epsilon_i \text{)} ~,$$ where $\epsilon_i \sim iid ~ N(0,\sigma^2)$

Executive Summary

We were interested in building a model to describe exam marks.

We estimate the expected exam mark to be between r resultConf[1] (out of 100).

We have no reason to believe that the expected exam mark differs from the historical average value of 55 (out of 100) (P-value = 0.17).



Try the s20x package in your browser

Any scripts or data that you put into this service are public.

s20x documentation built on Jan. 14, 2026, 9:07 a.m.