anscombe2: Teaching the paired t test
In PairedData: Paired Data Analysis

Description Usage Format Source References Examples

This dataset presents four sets of paired samples (n=15) that give nearly the same t statistic (t≈2.11) and thus nearly the same p-value, although the underlying situations differ markedly (differences in variances, clustering, heteroscedasticity). It underlines the importance of plotting data before testing. The name comes from Anscombe's famous dataset, created to illustrate simple linear regression.

data(anscombe2)

A data frame with 15 rows, 8 numeric columns of paired data, (X1,Y1), (X2,Y2), (X3,Y3), (X4,Y4), and 1 factor column, Subjects, giving a label for each subject.

S. Champely, CRIS, Lyon 1 University, FRANCE

F. J. Anscombe (1973), Graphs in statistical analysis. The American Statistician, 27, 17-21.

library(PairedData)  # provides paired() and the anscombe2 dataset
data(anscombe2)
# p is close to 0.05 for the paired t-test
with(anscombe2,plot(paired(X1,Y1),type="BA"))
with(anscombe2,t.test(paired(X1,Y1)))

# Same p but Var(X2)<Var(Y2) and
# correlation in the Bland-Altman plot
with(anscombe2,t.test(paired(X2,Y2)))
with(anscombe2,summary(paired(X2,Y2)))
with(anscombe2,plot(paired(X2,Y2),type="BA"))

# Same p but two clusters
with(anscombe2,plot(paired(X3,Y3),type="BA"))

# Same p but the difference is "linked" to the mean
with(anscombe2,plot(paired(X4,Y4),type="BA"))

Loading required package: MASS
Loading required package: gld
Loading required package: mvtnorm
Loading required package: lattice
Loading required package: ggplot2

Attaching package: 'PairedData'

The following object is masked from 'package:base':

    summary


	Paired t-test

data:  X1 and Y1
t = 2.1174, df = 14, p-value = 0.05261
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.01395401  2.17128734
sample estimates:
mean of the differences 
               1.078667 


	Paired t-test

data:  X2 and Y2
t = 2.1135, df = 14, p-value = 0.05299
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.1777987 24.1777987
sample estimates:
mean of the differences 
                     12 

$stat
         n mean median      trim        sd   IQR (*) median ad (*) mean ad (*)
X2 (x)  15   12   12.0 12.000000  4.472136  5.189029        5.9304    4.679039
Y2 (y)  15    0    0.0  0.000000 22.360680 25.945145       29.6520   23.395197
x-y     15   12   11.0 12.333333 21.990258 26.686434       32.6172   22.308992
(x+y)/2 15    6    9.5  6.166667 11.794369 13.343217       16.3086   12.574919
            sd(w)   min  max
X2 (x)   5.200007   5.0 19.0
Y2 (y)  26.000037 -35.0 35.0
x-y     28.598223 -25.0 43.0
(x+y)/2 13.231295 -13.5 22.5

$cor
            cor      wcor
(x,y) 0.1821429 0.1410256
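
The paired t statistic depends only on the mean and standard deviation of the differences, which is why very different data can yield the same test result. As a minimal sketch, the base R code below recomputes the (X2,Y2) test from the x-y row of the summary above (mean 12, sd 21.990258, n=15). The variable names are illustrative and not part of PairedData; the results should agree with the t.test output for X2 and Y2 up to rounding.

```r
# Summary values of the differences X2 - Y2, copied from the output above
d_mean <- 12
d_sd   <- 21.990258
n      <- 15

# Paired t statistic: mean difference divided by its standard error
se     <- d_sd / sqrt(n)
t_stat <- d_mean / se

# Two-sided p-value with n - 1 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# 95 percent confidence interval for the mean difference
ci <- d_mean + c(-1, 1) * qt(0.975, df = n - 1) * se

print(c(t = t_stat, p = p_value))
print(ci)
```

Any set of 15 differences with the same ratio of mean to standard deviation gives the same t and p, whatever its shape, so only a plot can distinguish the four situations.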