knitr::opts_chunk$set( collapse = TRUE, message = FALSE, warning = FALSE, fig.height = 6, fig.width = 7, fig.path = "fig/datasets-", dev = "png", comment = "##" ) # save some typing knitr::set_alias(w = "fig.width", h = "fig.height", cap = "fig.cap")
The vcdExtra
package contains r nrow(vcdExtra::datasets("vcdExtra"))
datasets, taken
from the literature on categorical data analysis, and selected to illustrate various
methods of analysis and data display. These are in addition to the
r nrow(vcdExtra::datasets("vcd"))
datasets in the vcd package.
To make it easier to find those which illustrate a particular method, the datasets in vcdExtra
have been classified
using method tags. This vignette creates an "inverse table", listing the datasets that apply to each method.
It also illustrates a general method for classifying datasets in R packages.
library(dplyr) library(tidyr) library(readxl)
Using the result of vcdExtra::datasets(package="vcdExtra")
I created a spreadsheet, vcdExtra-datasets.xlsx
,
and then added method tags.
dsets_tagged <- read_excel(here::here("inst", "extdata", "vcdExtra-datasets.xlsx"), sheet="vcdExtra-datasets") dsets_tagged <- dsets_tagged |> dplyr::select(-Title, -dim) |> dplyr::rename(dataset = Item) head(dsets_tagged)
To invert the table, need to split tags into separate observations, then collapse the rows for the same tag.
dset_split <- dsets_tagged |> tidyr::separate_longer_delim(tags, delim = ";") |> dplyr::mutate(tag = stringr::str_trim(tags)) |> dplyr::select(-tags) #' ## collapse the rows for the same tag tag_dset <- dset_split |> arrange(tag) |> dplyr::group_by(tag) |> dplyr::summarise(datasets = paste(dataset, collapse = "; ")) |> ungroup() # get a list of the unique tags unique(tag_dset$tag)
Another sheet in the spreadsheet gives a more descriptive topic
for corresponding to each tag.
tags <- read_excel(here::here("inst", "extdata", "vcdExtra-datasets.xlsx"), sheet="tags") head(tags)
Now, join this with the tag_dset
created above.
tag_dset <- tag_dset |> dplyr::left_join(tags, by = "tag") |> dplyr::relocate(topic, .after = tag) tag_dset |> dplyr::select(-tag) |> head()
help()
We're almost there. It would be nice if the dataset names could be linked to their documentation.
This function is designed to work with the pkgdown
site. There are different ways this can be
done, but what seems to work is a link to ../reference/{dataset}.html
Unfortunately, this won't work in the actual vignette.
add_links <- function(dsets, style = c("reference", "help", "rdrr.io"), sep = "; ") { style <- match.arg(style) names <- stringr::str_split_1(dsets, sep) names <- dplyr::case_when( style == "help" ~ glue::glue("[{names}](help({names}))"), style == "reference" ~ glue::glue("[{names}](../reference/{names}.html)"), style == "rdrr.io" ~ glue::glue("[{names}](https://rdrr.io/cran/vcdExtra/man/{names}.html)") ) glue::glue_collapse(names, sep = sep) }
Use purrr::map()
to apply add_links()
to all the datasets for each tag.
(mutate(datasets = add_links(datasets))
by itself doesn't work.)
tag_dset |> dplyr::select(-tag) |> dplyr::mutate(datasets = purrr::map(datasets, add_links)) |> knitr::kable()
Voila!
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.