HistData-package R Documentation

Data sets from the History of Statistics and Data Visualization

Description

The HistData package provides a collection of data sets that are interesting and important in the history of statistics and data visualization. The goal of the package is to make these available, both for instructional use and for historical research.

Some of the data sets have examples which reproduce an historical graph or analysis. These are meant mainly as starters for more extensive re-analysis or graphical elaboration. Some of these present graphical challenges to reproduce in R.

They are part of a program of research called statistical historiography, meaning the use of statistical methods to study problems and questions in the history of statistics and graphics. A main aspect of this is the increased understanding of historical problems in science and data analysis through the process of trying to reproduce a graph or analysis using modern methods. I call this "Re-visioning", meaning to see again, hopefully in a new light.

A number of these are illustrated in our book, Friendly & Wainer (2021), A History of Data Visualization and Graphic Communication, and some are reproduced in R on the companion web site, https://friendly.github.io/HistDataVis/.

Details

Descriptions of each DataSet can be found using help(DataSet); example(DataSet) will likely show applications similar to the historical use.
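As a minimal sketch of this workflow, assuming the HistData package is installed, the following loads one data set, inspects it, and then shows its help page and runs its examples. Arbuthnot is used purely as an illustration; any data set name from the list works the same way. The calls are wrapped in a requireNamespace() check so the snippet does nothing if the package is missing.

```r
# Assumption: HistData is installed, e.g. via install.packages("HistData").
if (requireNamespace("HistData", quietly = TRUE)) {
  library(HistData)

  data(Arbuthnot)             # load one data set by name
  str(Arbuthnot)              # inspect its variables

  # help() must be printed explicitly inside a block;
  # at the console, help(Arbuthnot) alone is enough.
  print(help("Arbuthnot", package = "HistData"))

  # Run the examples from that help page without prompting between plots.
  example("Arbuthnot", package = "HistData", ask = FALSE)
}
```

At an interactive prompt the shorter forms help(Arbuthnot) and example(Arbuthnot) are equivalent once the package is attached.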

Data sets included in the HistData package are:

Arbuthnot

Arbuthnot's data on male and female birth ratios in London from 1629-1710

Armada

The Spanish Armada

Bowley

Bowley's data on values of British and Irish trade, 1855-1899

Breslau

Halley's Breslau Life Table

Cavendish

Cavendish's 1798 determinations of the density of the earth

ChestSizes

Quetelet's data on chest measurements of Scottish militiamen

Cholera

William Farr's Data on Cholera in London, 1849

CholeraDeaths1849

Daily Deaths from Cholera and Diarrhoea in England, 1849

CushnyPeebles

Cushny-Peebles data: Soporific effects of scopolamine derivatives

Dactyl

Edgeworth's counts of dactyls in Virgil's Aeneid

DrinksWages

Elderton and Pearson's (1910) data on drinking and wages

EdgeworthDeaths

Edgeworth's Data on Death Rates in British Counties

Fingerprints

Waite's data on Patterns in Fingerprints

Galton

Galton's data on the heights of parents and their children

GaltonFamilies

Galton's data on the heights of parents and their children, by family

Guerry

Data from A.-M. Guerry, "Essay on the Moral Statistics of France"

HalleyLifeTable

Halley's Life Table

Jevons

W. Stanley Jevons' data on numerical discrimination

Langren

van Langren's data on longitude distance between Toledo and Rome

Macdonell

Macdonell's data on height and finger length of criminals, used by Gosset (1908)

Mayer

Mayer's data on the libration of the moon

Michelson

Michelson's 1879 determinations of the velocity of light

Minard

Data from Minard's famous graphic map of Napoleon's march on Moscow

Nightingale

Florence Nightingale's data on deaths from various causes in the Crimean War

OldMaps

Latitudes and Longitudes of 39 Points in 11 Old Maps

PearsonLee

Pearson and Lee's 1896 data on the heights of parents and children classified by gender

PolioTrials

Polio Field Trials Data on the Salk vaccine

Pollen

5D dataset from the 1986 JSM Challenge

Prostitutes

Parent-Duchatelet's time-series data on the number of prostitutes in Paris

Pyx

Trial of the Pyx

Quarrels

Statistics of Deadly Quarrels

Saturn

Laplace's Saturn data

Snow

John Snow's map and data on the 1854 London Cholera outbreak

Virginis

J. F. W. Herschel's data on the orbit of the twin star gamma Virginis

Wheat

Playfair's data on wages and the price of wheat

Yeast

Student's (1906) Yeast Cell Counts

ZeaMays

Darwin's Heights of Cross- and Self-fertilized Zea mays Pairs
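The list above can also be retrieved from an installed copy of the package. The sketch below, assuming HistData is installed, uses the data() function from the utils package to collect the data set names and titles into a data frame; the variable names listing and datasets are only illustrative. The installed version may contain more or fewer entries than the list shown here.

```r
# Assumption: HistData is installed; otherwise the block is skipped.
if (requireNamespace("HistData", quietly = TRUE)) {
  # data(package = ) returns an object whose $results component is a
  # character matrix with columns Package, LibPath, Item and Title.
  listing <- data(package = "HistData")$results

  datasets <- as.data.frame(listing[, c("Item", "Title"), drop = FALSE])

  print(head(datasets))   # first few data set names and titles
  print(nrow(datasets))   # number of entries in the installed version
}
```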

Author(s)

Michael Friendly

Maintainer: Michael Friendly <friendly@yorku.ca>

References

Friendly, M. (2007). A Brief History of Data Visualization. In Chen, C., Härdle, W. & Unwin, A. (eds.) Handbook of Computational Statistics: Data Visualization, Springer-Verlag, III, Ch. 1, 1-34.

Friendly, M. & Denis, D. (2001). Milestones in the history of thematic cartography, statistical graphics, and data visualization. http://datavis.ca/milestones/

Friendly, M. & Denis, D. (2005). The early origins and development of the scatterplot. Journal of the History of the Behavioral Sciences, 41, 103-130.

Friendly, M., Sigal, M. & Harnanansingh, D. (2016). The Milestones Project: A Database for the History of Data Visualization. In Kostelnick, C. & Kimball, M. (eds.), Visible Numbers: The History of Data Visualization, Ashgate Press, Chapter 10.

Friendly, M. & Wainer, H. (2021). A History of Data Visualization and Graphic Communication. Harvard University Press. Book: https://www.hup.harvard.edu/catalog.php?isbn=9780674975231, Web site: https://friendly.github.io/HistDataVis/.

See Also

Arbuthnot, Armada, Bowley, Cavendish, ChestSizes, Cholera, CholeraDeaths1849, CushnyPeebles, Dactyl, DrinksWages, EdgeworthDeaths, Fingerprints, Galton, GaltonFamilies, Guerry, HalleyLifeTable, Jevons, Langren, Macdonell, Michelson, Minard, Nightingale, OldMaps, PearsonLee, PolioTrials, Pollen, Prostitutes, Pyx, Quarrels, Snow, Wheat, Yeast, ZeaMays

Other packages containing data sets of historical interest include:

The Guerry-package, containing maps and other data sets related to Guerry's (1833) Moral Statistics of France.

morsecodes from the (defunct) xgobi package for data from Rothkopf (1957) on errors in learning Morse code, a classical example for MDS.

The psych package, containing Galton's peas data. The same data set is contained in alr4 as galtonpeas.

The agridat package, containing a large number of agricultural data sets, including some extra data sets related to the classical barley data (immer and barley) from Immer (1934): minnesota.barley.yield, minnesota.barley.weather.

Examples

# see examples for the separate data sets

HistData documentation built on Aug. 10, 2023, 1:08 a.m.