knitr::opts_chunk$set( collapse = FALSE, comment = "" ) # https://community.rstudio.com/t/showing-only-the-first-few-lines-of-the-results-of-a-code-chunk/6963/2 # hook_output <- knitr::knit_hooks$get("output") knitr::knit_hooks$set( output = function(x, options) { lines <- options$output.lines if (is.null(lines)) { return(hook_output(x, options)) # pass to default hook } x <- unlist(strsplit(x, "\n")) more <- "..." if (length(lines)==1) { # first n lines if (length(x) > lines) { # truncate the output, but add .... x <- c(head(x, lines), more) } } else { x <- c(more, x[lines], more) } # paste these lines together x <- paste(c(x, ""), collapse = "\n") hook_output(x, options) } )
library("steward")
The goal of this package is to make it a little easier to build and publish data-dictionaries, in particular:
In this article, we will look at a few (what we think are) common tasks:
For this set of examples, we use the ggplot2 diamonds
dataset. We will create an stw_dataset
, which is a data frame with an extra class attached to help manage metadata. An stw_dataset
is a data frame, in the same way that a tibble is a data frame.
In the code that follows we have three steps:
stw_dataset()
using a data frame.stw_mutate_meta()
.stw_mutate_dict()
.The functions stw_mutate_meta()
and stw_mutate_dict()
work a little bit like dplyr::mutate()
. However, instead of mutating the columns of a data frame, stw_mutate_meta()
mutates the top-level metadata, and stw_mutate_dict()
mutates the descriptions of columns in a data frame:
diamonds_new <- stw_dataset(ggplot2::diamonds) %>% stw_mutate_meta( name = "diamonds", title = "Prices of 50,000 round cut diamonds", description = "A dataset containing the prices and other attributes of almost 54,000 diamonds.", sources = list( list( title = "DiamondSearchEngine", path = "http://www.diamondse.info/" ) ) ) %>% stw_mutate_dict( price = "price in US dollars ($326--$18,823)", carat = "weight of diamond (0.2--5.01)", cut = "quality of the cut (Fair, Good, Very Good, Premium, Ideal)", color = "diamond color, from D (best) to J (worst)", clarity = "a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))", x = "length in mm (0--10.74)", y = "width in mm (0--58.9)", z = "depth in mm (0--31.8)", depth = "total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43--79)", table = "width of top of diamond relative to widest point (43--95)" )
We also offer a function to make some basic checks on the completeness of the metadata:
stw_check(diamonds_new, verbosity = "all")
If you are including a dataset in a package, it is good practice to document it; in fact, CRAN insists!
You can use an stw_dataset
to keep your documentation associated with your dataset as you build it for your package.
By keeping all the "stuff" together, this helps assure that the dataset documentation is complete and current.
You can use the function stw_use_data()
wraps usethis::use_data()
and steward::stw_write_roxygen()
. For example, by running:
stw_use_data(diamonds_new)
You will:
diamonds_new
dataset to your package.R/data-diamonds_new.R
.If you are creating an RMarkdown document, you can use the stw_to_table()
function to create a gt table:
stw_to_table(diamonds_new)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.