knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE,
  comment = "##",
  fig.path = "man/figures/README-",
  fig.height = 5,
  fig.width = 5
#  out.width = "100%"
)

library(HistData)

Lifecycle: stable

HistData

Data Sets from the History of Statistics and Data Visualization

Dev. Version: 0.9-2

The HistData package provides a collection of small data sets that are interesting and important in the history of statistics and data visualization. The goal of the package is to make these available, both for instructional use (as examples, problem sets or projects) and for historical research (extending or criticizing a previous analysis). Some of these present interesting challenges, or opportunities to "show off", with graphics or analysis in R.

Many of the data sets have examples which reproduce an historical graph or analysis. These are meant mainly as starters for more extensive re-analysis or graphical elaboration. If you are interested in any of these problems or data sets, I've purposely left lots of room to do better!
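As one illustration of such a starter, here is a minimal sketch using the `Galton` data set (mid-parent and child heights). It is not necessarily the example shipped on the data set's help page; the plot styling and the guard around the package check are choices made for this sketch.

```r
# A sketch of Galton's "regression toward mediocrity".
# Assumes HistData is installed; the guard skips the example otherwise.
if (requireNamespace("HistData", quietly = TRUE)) {
  data(Galton, package = "HistData")

  # Regress child height on mid-parent height (both in inches).
  fit <- lm(child ~ parent, data = Galton)

  # The heights are binned, so jitter the points to reduce overplotting.
  with(Galton, plot(jitter(parent), jitter(child),
                    xlab = "Mid-parent height (in)",
                    ylab = "Child height (in)",
                    pch = 16, col = "grey50"))

  abline(fit, lwd = 2)              # fitted regression line
  abline(a = 0, b = 1, lty = 2)     # line of equality, for comparison

  print(coef(fit))
}
```

A fitted slope below 1 is what Galton described as regression: children of tall (or short) parents tend to be closer to the average than their parents. From here, `example(Galton)` runs the examples provided with the package.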

These data sets are part of a program of research called statistical historiography (Friendly, 2007; Friendly & Denis, 2001; Friendly et al., 2016), meaning the use of statistical methods to study problems and questions in the history of statistics and graphics. A main aspect of this is the increased understanding of historical problems in science and data analysis that comes through the process of trying to reproduce a graph or analysis using modern methods. I call this "Re-visioning", meaning to see again, hopefully in a new light.

The data sets are also used in our book, A History of Data Visualization and Graphic Communication (Friendly & Wainer, 2021). See also the companion website for this book.

Data science

One further aspect deserves mention here: a great deal of "data sciency" work went into constructing this package. Alas (for teaching purposes), that work is not captured in the resulting CRAN-friendly package.

Installation

Get the released version from CRAN:

install.packages("HistData")

The development version can be installed into your R library directly from GitHub via:

remotes::install_github("friendly/HistData")

Data sets

Here are the data sets in the package. Some topics are represented by two or more data sets.

vcdExtra::datasets("HistData") |> dplyr::select(Item, Title) |> knitr::kable()
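If `vcdExtra` and `dplyr` are not available, a similar listing can be produced with base R alone. The following is a minimal sketch; `list_datasets()` is a hypothetical helper defined here for illustration, not a function from HistData or vcdExtra.

```r
# Hypothetical helper: list the data sets in an installed package using
# only base R. utils::data(package = ) returns an object whose $results
# matrix has the columns "Package", "LibPath", "Item" and "Title".
list_datasets <- function(pkg) {
  res <- utils::data(package = pkg)$results
  data.frame(
    Item  = res[, "Item"],
    Title = res[, "Title"],
    stringsAsFactors = FALSE
  )
}

# Assumes HistData is installed; the guard skips the listing otherwise.
if (requireNamespace("HistData", quietly = TRUE)) {
  print(head(list_datasets("HistData")))
}
```

The helper works for any installed package, for example `list_datasets("datasets")`.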

Contributors

Please note that the HistData project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Over the years, many people have contributed new data sets and offered corrections, suggestions, or documentation examples. They are gratefully acknowledged below:

David Bellhouse, Brian Clair, Stephane Dray, Luiz Droubi, Antoine de Falguerolles, Monique Graf, James Hanley, Peter Li, Dennis Murphy, Jim Oeppen, James Riley, Neville Verlander, Hadley Wickham.

References

Friendly, M. (2007). A Brief History of Data Visualization. In Chen, C., Härdle, W. & Unwin, A. (eds.), Handbook of Computational Statistics: Data Visualization, Springer-Verlag, III, Ch. 1, 1-34. Preprint

Friendly, M. & Denis, D. (2001). Milestones in the history of thematic cartography, statistical graphics, and data visualization. Web site: http://datavis.ca/milestones/

Friendly, M., Sigal, M. & Harnanansingh, D. (2016). The Milestones Project: A Database for the History of Data Visualization. In Kostelnick, C. & Kimball, M. (eds.), Visible Numbers: The History of Data Visualization, Ashgate Press, Chapter 10. Preprint

Friendly, M. & Wainer, H. (2021). A History of Data Visualization and Graphic Communication. Harvard University Press. Companion web site



friendly/HistData documentation built on April 30, 2024, 7:14 p.m.