anscombe_outlier: Anscombe's Quartet Outlier Data

anscombe_outlier    R Documentation

Anscombe's Quartet Outlier Data

Description

This dataset contains 11 observations generated by Francis Anscombe to demonstrate that statistical summary measures alone cannot capture the full relationship between two variables (here, x and y). Anscombe emphasized the importance of visualizing data prior to calculating summary statistics.

Usage

anscombe_outlier

Format

A data frame with 11 rows and 2 variables:

  • x: the x-variable

  • y: the y-variable

Details

This dataset has a linear relationship between x and y, with a single outlier.
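One way to locate the outlier is to fit a line and look for the largest absolute residual. The sketch below is illustrative only: it assumes that anscombe_outlier contains the same values as columns x3 and y3 of the anscombe data frame shipped with base R, and the names dat, fit, outlier_idx, fit_clean and cor_clean are made up for this example.

```r
# Assumption: anscombe_outlier matches columns x3 and y3 of base R's `anscombe`.
dat <- data.frame(x = datasets::anscombe$x3, y = datasets::anscombe$y3)

# Fit a simple linear regression and find the point with the largest residual.
fit <- lm(y ~ x, data = dat)
outlier_idx <- unname(which.max(abs(residuals(fit))))
dat[outlier_idx, ]

# Refit without that point and check how close the rest are to a straight line.
fit_clean <- lm(y ~ x, data = dat[-outlier_idx, ])
cor_clean <- cor(dat$x[-outlier_idx], dat$y[-outlier_idx])

# Plot the data with both fitted lines (dashed: all points, solid: outlier removed).
plot(dat$x, dat$y, xlab = "x", ylab = "y")
abline(fit, lty = 2)
abline(fit_clean)
```

If the package is installed, replacing the first assignment with `dat <- quartets::anscombe_outlier` should give the same picture.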

Additionally, the following statistical summaries hold (approximately):

  • mean of x: 9

  • variance of x: 11

  • mean of y: 7.5

  • variance of y: 4.125

  • correlation between x and y: 0.816

  • linear regression between x and y: y = 3 + 0.5x

  • R^2 for the regression: 0.67
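The summaries listed above can be checked with a few lines of base R. This is a sketch, not part of the package: it assumes that anscombe_outlier holds the same values as columns x3 and y3 of the built-in anscombe data frame, and the object names (dat, fit, summaries) are hypothetical.

```r
# Assumption: anscombe_outlier matches columns x3 and y3 of base R's `anscombe`.
dat <- data.frame(x = datasets::anscombe$x3, y = datasets::anscombe$y3)

# Least-squares fit of y on x.
fit <- lm(y ~ x, data = dat)

# Collect the summary statistics in one named vector.
summaries <- c(
  mean_x    = mean(dat$x),
  var_x     = var(dat$x),
  mean_y    = mean(dat$y),
  var_y     = var(dat$y),
  cor_xy    = cor(dat$x, dat$y),
  intercept = unname(coef(fit)[1]),
  slope     = unname(coef(fit)[2]),
  r_squared = summary(fit)$r.squared
)

# Round for display; the listed values agree with these up to rounding.
round(summaries, 3)
```

The same statistics come out nearly identical for the other three datasets in the quartet, which is why the plot, not the summary table, reveals the outlier.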

References

Anscombe, F. J. (1973). "Graphs in Statistical Analysis". American Statistician. 27 (1): 17–21. doi:10.1080/00031305.1973.10478966. JSTOR 2682899.


quartets documentation built on April 14, 2023, 12:25 a.m.