knitr::opts_chunk$set(
  message = FALSE,
  warning = FALSE,
  fig.height = 5,
  fig.width = 5,
  # results = 'hide',
  # fig.keep = 'none',
  fig.path = 'fig/datasets-',
  echo = TRUE,
  collapse = TRUE,
  comment = "#>"
)

set.seed(1071)
options(width = 80, digits = 5, continue = " ")
library(heplots)
library(candisc)
library(ggplot2)
library(dplyr)
Datasets used in package examples are an important part of making a package understandable and usable, but they are often overlooked.
In developing the heplots package, I assembled a large collection of datasets illustrating a variety of multivariate linear models, with some analyses and graphical displays. Each of these has much more than the usual stub examples, which often look like:
data(dataset)
# str(dataset); plot(dataset)
But .Rd, and now roxygen, don't make it easy to work with numerous datasets in a package or, more importantly, to document what they illustrate. I'm showing the work to create this vignette, in case these ideas are useful to others.
In this release, I started with a file generated by:
vcdExtra::datasets("heplots") |> head(4)
Then, in the roxygen documentation, I added @concept tags to classify these datasets according to the methods used. (@concept entries are indexed with the package, so they work via help.search().)
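As a minimal sketch of such a lookup, the call below restricts help.search() to the concept field. It assumes heplots is installed, so the call is guarded, and the pattern "MANOVA" is taken from the AddHealth tags shown in this vignette.

```r
# Search only the concept index of the installed heplots package.
# Guarded so the snippet still runs where heplots is not installed.
res <- if (requireNamespace("heplots", quietly = TRUE)) {
  utils::help.search("MANOVA", fields = "concept", package = "heplots")
} else {
  NULL
}
res
```

Printing the result lists the matching help topics, if any are found.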
For example, the documentation for the AddHealth data contains these lines:
#' @name AddHealth
#' @docType data
...
#' @keywords datasets
#' @concept MANOVA
#' @concept ordered
With standard processing, these concepts, along with the keywords, appear in the Index section of the manual constructed by devtools::build_manual(). In the pkgdown site for this package, they can also be found through the search box.
With a bit of extra processing, I created a dataset, datasets.csv, which is used below.
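The extra processing itself is not shown here. One way it could work, sketched under assumptions, is to scan the roxygen comment lines for tags and collapse the @concept values into a single space-delimited string, which is the format the tags column is split on later in this vignette. The helper name get_tag and the roxy vector are hypothetical, and this is not necessarily how datasets.csv was actually built.

```r
# Roxygen lines modeled on the AddHealth example above (illustrative input)
roxy <- c(
  "#' @name AddHealth",
  "#' @docType data",
  "#' @keywords datasets",
  "#' @concept MANOVA",
  "#' @concept ordered"
)

# Hypothetical helper: return the values given for one roxygen tag
get_tag <- function(lines, tag) {
  pattern <- paste0("^#'\\s*@", tag, "\\s+")
  hits <- grepl(pattern, lines)
  trimws(sub(pattern, "", lines[hits]))
}

# One row per dataset, with its concepts joined by spaces
row <- data.frame(
  dataset = get_tag(roxy, "name"),
  tags    = paste(get_tag(roxy, "concept"), collapse = " ")
)
row
```

Applying the same helper to each dataset's documentation and row-binding the results would give a table with one row per dataset.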
The main methods used in the example datasets are shown in the table below:
In addition, a few examples illustrate special handling for linear hypotheses concerning factors:
The dataset names are linked to the documentation, with graphical output, on the pkgdown website, http://friendly.github.io/heplots/.
library(here)
library(dplyr)
library(tinytable)
#dsets <- read.csv(here::here("extra", "datasets.csv"))  # doesn't work in a vignette
dsets <- read.csv("https://raw.githubusercontent.com/friendly/heplots/master/extra/datasets.csv")
dsets <- dsets |>
  dplyr::select(-X) |>
  arrange(tolower(dataset))

# link dataset to pkgdown doc
refurl <- "http://friendly.github.io/heplots/reference/"
dsets <- dsets |>
  mutate(dataset = glue::glue("[{dataset}]({refurl}{dataset}.html)"))

#knitr::kable(dsets)
tinytable::tt(dsets) |>
  format_tt(markdown = TRUE)
This table can be inverted to list the datasets that illustrate each concept:
concepts <- dsets |>
  select(dataset, tags) |>
  tidyr::separate_longer_delim(tags, delim = " ") |>
  arrange(tags, dataset) |>
  summarize(datasets = toString(dataset), .by = tags) |>
  rename(concept = tags)

#knitr::kable(concepts)
tinytable::tt(concepts) |>
  format_tt(markdown = TRUE)
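To see what this inversion does without downloading datasets.csv, here is a small base R sketch on a toy table. Only the AddHealth row reflects tags shown in this vignette; dataB and dataC are hypothetical placeholders, and strsplit() with aggregate() stands in for the tidyr and dplyr steps.

```r
# Toy stand-in for dsets; dataB and dataC are made-up names
toy <- data.frame(
  dataset = c("AddHealth", "dataB", "dataC"),
  tags    = c("MANOVA ordered", "MANOVA", "ordered")
)

# Split the space-delimited tags: one row per (dataset, concept) pair
tag_list <- strsplit(toy$tags, " ", fixed = TRUE)
long <- data.frame(
  dataset = rep(toy$dataset, lengths(tag_list)),
  concept = unlist(tag_list)
)

# Collapse the dataset names within each concept
inverted <- aggregate(dataset ~ concept, data = long, FUN = toString)
inverted
```

Each concept ends up on one row, with the datasets that carry it joined into a comma-separated string.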