Case Study 9.1: Language score vs teaching method and student IQ

## Do not delete this!
## It loads the s20x library for you. If you delete it,
## your document may not compile.
require(s20x)

knitr::opts_chunk$set(
  dev = "png",
  fig.ext = "png",
  dpi = 96
)

Problem

Educational experts were interested in which of three different teaching methods was most effective at increasing students' language test scores for children of a range of abilities, as measured by IQ. Moreover, they wanted to know whether the relative effectiveness of the methods differed according to IQ.

An experiment was conducted in which 30 students were randomly allocated to three groups, and each group was taught using a different teaching method. This randomisation was done to ensure that a range of student abilities was represented in each group. As the students were measured under test conditions, we can assume that their test scores are independent of each other.

The variables of interest were:

- lang: the student's language test score,
- IQ: the student's IQ, and
- method: the teaching method used (1, 2, or 3).

Question of Interest

We wish to see whether the language score achieved depended on the teaching method used. We also want to adjust for any confounding effect of the students' IQs.

Read in and Inspect the Data

data(teach.df)
head(teach.df)
str(teach.df)
# method is coded numerically, so convert it into a factor variable
teach.df$method = factor(teach.df$method)
plot(lang ~ IQ, main = "Language Score versus IQ (by method)",
     pch = as.character(teach.df$method), data = teach.df)

Looking at the coded scatter plot, the points for each method appear to follow three roughly parallel lines. Language score increases with IQ, and at any given IQ, method 2 scores highest while method 3 scores lowest. The variability around each method's own line is much lower than the overall variability in language score.
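As an informal check of the parallel-lines impression (an illustrative sketch, not part of the original handout), we can overlay a separate least-squares line for each method; if the three lines look near-parallel, an IQ-by-method interaction is unlikely to be needed, which the formal test below confirms.

# Illustrative only: overlay a separate within-method least-squares line
plot(lang ~ IQ, main = "Separate least-squares lines by method",
     pch = as.character(teach.df$method), data = teach.df)
for (m in levels(teach.df$method)) {
  abline(lm(lang ~ IQ, data = subset(teach.df, method == m)),
         lty = as.numeric(m))
}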

Model Building and Check Assumptions

# Full model: language score explained by IQ, method, and their interaction
teach.fit = lm(lang ~ IQ * method, data = teach.df)
plot(teach.fit, which = 1)
normcheck(teach.fit)
cooks20x(teach.fit)
anova(teach.fit)

# The interaction is not significant, so refit without it
teach.fit2 = lm(lang ~ IQ + method, data = teach.df)
plot(teach.fit2, which = 1)
normcheck(teach.fit2)
cooks20x(teach.fit2)
anova(teach.fit2)
summary(teach.fit2)
confint(teach.fit2)
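As a cross-check (an illustrative extra, not in the original analysis), dropping the interaction can also be tested by comparing the two nested models directly; the resulting F-test reproduces the IQ:method line of anova(teach.fit).

# Illustrative: F-test for the terms dropped between the nested models
anova(teach.fit2, teach.fit)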

Visualise the Final Model

plot(lang ~ IQ, main = "Language Score versus IQ (by method)",
     pch = as.character(teach.df$method), data = teach.df)
# Three parallel fitted lines: common slope, different intercepts
abline(teach.fit2$coef[1], teach.fit2$coef[2], lty = 1)                       # method 1 (baseline)
abline(teach.fit2$coef[1] + teach.fit2$coef[3], teach.fit2$coef[2], lty = 2)  # method 2
abline(teach.fit2$coef[1] + teach.fit2$coef[4], teach.fit2$coef[2], lty = 4)  # method 3
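A legend makes the line types easier to match to the methods (a small optional addition, not in the original code; the "topleft" position is a guess and may need adjusting for this plot):

# Optional: label the three fitted lines
legend("topleft", legend = paste("Method", 1:3),
       lty = c(1, 2, 4), pch = as.character(1:3))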

Generate Model Output when the Baseline of method is "2"

# Refit with method 2 as the baseline so the other two methods are
# compared directly against it
teach.df$method = relevel(teach.df$method, ref = "2")
teach.fit3 = lm(lang ~ IQ + method, data = teach.df)
confint(teach.fit3)

# Methods 1 and 3 relative to method 2: the estimates are negative, so
# take absolute values and swap the interval endpoints
conf1 = as.data.frame(abs(confint(teach.fit3)[c(3, 4), ]))
resultStr1 = paste0(sprintf("%.1f", conf1$`97.5 %`), " and ", sprintf("%.1f", conf1$`2.5 %`))

# Method 3 relative to method 1, from the original fit
conf2 = as.data.frame(t(abs(confint(teach.fit2)[4, ])))
resultStr2 = paste0(sprintf("%.1f", conf2$`97.5 %`), " and ", sprintf("%.1f", conf2$`2.5 %`))

# Effect on mean language score of a 10-point increase in IQ
conf3 = as.data.frame(t(abs(confint(teach.fit2)[2, ] * 10)))
resultStr3 = paste0(sprintf("%.1f", conf3$`2.5 %`), " and ", sprintf("%.1f", conf3$`97.5 %`))
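Printing these strings (an illustrative check, not in the original) shows the interval text that gets inlined into the summary below via inline R chunks such as `r resultStr1[1]`:

# Illustrative: inspect the formatted interval strings
resultStr1
resultStr2
resultStr3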

Method and Assumption Checks

To explain language score, we first fitted a model with explanatory variables teaching method, IQ, and their interaction. However, the interaction term was not significant (P-value = 0.37), so the model was refitted with the interaction term removed.

All model assumptions were satisfied. [Optional: The students should act independently of each other, as they were randomly allocated to the teaching methods and their scores were measured under test conditions.]

Our final model is $$lang_i = \beta_0 + \beta_1 \times IQ_i + \beta_2 \times method.method2_i + \beta_3 \times method.method3_i + \epsilon_i,$$ where:

- $lang_i$ is the language score of the $i$th student,
- $IQ_i$ is the IQ of the $i$th student,
- $method.method2_i$ is one if student $i$ was taught with method 2 and zero otherwise,
- $method.method3_i$ is one if student $i$ was taught with method 3 and zero otherwise, and
- $\epsilon_i \sim iid\ N(0, \sigma^2)$.

Here method 1 is our baseline.

The final model was also refitted with method 2 as the baseline. Note that when we change the baseline to level 2, the set of dummy variables changes: $method.method2_i$ is replaced by $method.method1_i$, which is set to one if student $i$ received method 1 and to zero otherwise.
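To see the dummy coding concretely (an illustrative check, not in the original handout), the design matrices of the two fits show which indicator columns each choice of baseline produces:

# Illustrative: baseline method 1 yields columns method2 and method3,
# while baseline method 2 yields columns method1 and method3
head(model.matrix(teach.fit2))
head(model.matrix(teach.fit3))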

Our model explains almost `r round(100*summary(teach.fit2)$r.squared)`% of the variation in language score.

Executive Summary

We were interested in comparing the effectiveness of three teaching methods on the language scores achieved by students. We also wanted to see how this was affected by the students' IQs.

We found that the effects of the teaching methods were the same regardless of IQ, and that the effect of IQ was the same regardless of teaching method.

In particular, teaching method 2 was significantly better than the other two methods, and method 1 was significantly better than method 3.

Not surprisingly, students with higher IQ tended to score higher.

With 95% confidence:

- we estimate that the mean language score for students taught with method 2 is between `r resultStr1[1]` points higher than for those taught with method 1, and between `r resultStr1[2]` points higher than for those taught with method 3,
- we estimate that the mean language score for students taught with method 1 is between `r resultStr2` points higher than for those taught with method 3, and
- we estimate that for each 10-point increase in IQ, the mean language score increases by between `r resultStr3` points.

\pagebreak{}

What happens if we don't adjust for IQ?

We expect the confidence intervals for the method comparisons to get wider, because the residual error now has to "absorb" the variation that IQ previously explained.

# Refit using teaching method only, under each baseline
teach.df$method = relevel(teach.df$method, ref = "1")
teach.fit5 = lm(lang ~ method, data = teach.df)
teach.df$method = relevel(teach.df$method, ref = "2")
teach.fit6 = lm(lang ~ method, data = teach.df)
ci = confint(teach.fit5)
ci2 = confint(teach.fit6)
r2 = round(100 * summary(teach.fit5)$r.squared)

With 95% confidence:

Our model explains almost `r r2`% of the variation in language score. You should be able to see that every one of these intervals is wider than before.
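A quick numeric check (illustrative only, not part of the original handout) compares the widths of the method 3 versus method 1 intervals with and without the IQ adjustment:

# Illustrative: interval widths for method 3 vs method 1
diff(confint(teach.fit2)["method3", ])  # adjusted for IQ
diff(confint(teach.fit5)["method3", ])  # not adjusted for IQ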

Multiple comparisons adjustment (Chapter 11)

This will only make sense after Chapter 11:

options(digits = 4)
require(emmeans)
# Tukey-adjusted pairwise comparisons of the three methods
emmeans(teach.fit2, specs = pairwise ~ method, infer = TRUE)$contrasts

# Compare to the unadjusted output from the lm fits
cbind(coef(summary(teach.fit2)), confint(teach.fit2))
cbind(coef(summary(teach.fit3)), confint(teach.fit3))
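To see exactly what the multiplicity adjustment changes (an illustrative extra), the same contrasts can be requested without adjustment; these reproduce the unadjusted lm intervals above:

# Illustrative: unadjusted pairwise contrasts match the lm output
emmeans(teach.fit2, specs = pairwise ~ method, infer = TRUE, adjust = "none")$contrasts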

