knitr::opts_chunk$set( collapse = TRUE, eval=FALSE, comment = "#>" )
library(dataset)
In this case, we will create the dataset from a file. We add the mandatory Title attribute to the dataset, and a mandatory data frame identifier, dataset_id. Because the iris dataset does not have a clear identifier, the dataset() constructor will use the row names as unique observation identifiers. We do not define a dimension column, and apart from the Species attribute, we define all variables as measurements.
In a tidy dataset, we only apply one unit of measure. In this case, this is millimeters. We add this as an optional parameter to the dataset.
The .f parameter names the function that will read in the dataset. Any function can be used, but it must be installed on the user's computer. The ... ellipsis must contain all parameters that are necessary to make the call to the file reading function successful. In this case, we will use a single parameter, the path to the file.
temp_path <- file.path(tempdir(), "iris.csv")
write.csv(iris, file = temp_path, row.names = FALSE)
file.exists(temp_path)

iris_ds <- read_dataset(
  dataset_id = "iris_dataset",
  obs_id = NULL,
  dimensions = NULL,
  measures = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  attributes = "Species",
  Title = "Iris Dataset",
  unit = list(code = "MM", label = "millimeter"),
  .f = "utils::read.csv",
  file = temp_path
)
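The way a function name passed as a string can be combined with the arguments in the ellipsis can be sketched in base R. This is only an illustration of the general technique, not the package's actual implementation: the helper call_by_name() and the file name iris_sketch.csv are hypothetical names introduced here.

```r
# Hypothetical helper, NOT part of the dataset package: resolve a function
# given as a "pkg::fun" or "fun" string, then call it with the arguments
# supplied through the ellipsis.
call_by_name <- function(.f, ...) {
  parts <- strsplit(.f, "::", fixed = TRUE)[[1]]
  fun <- if (length(parts) == 2) {
    # Namespaced form, e.g. "utils::read.csv"
    getExportedValue(parts[1], parts[2])
  } else {
    # Bare form, e.g. "summary", looked up on the search path
    get(.f, mode = "function")
  }
  # Forward everything in ... to the resolved function
  do.call(fun, list(...))
}

# Write a temporary CSV file and read it back through the helper
tmp <- file.path(tempdir(), "iris_sketch.csv")
write.csv(iris, file = tmp, row.names = FALSE)
iris_df <- call_by_name("utils::read.csv", file = tmp)
dim(iris_df)
```

Passing the function as a string keeps a readable record of which function, from which package, was used. That matters when the call itself is stored alongside the data.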
The attributes of the dataset contain the creation date and the function call that created the dataset. The related items contain the bibliographic reference to the creator function. Because we used a local file, the entire path is recorded for reproducibility. A similar function will be defined for internet URLs.
print(attributes(iris_ds))
Let's apply a very simple function (method) to this dataset: the summary function (in this case, the summary method of the base R data.frame class).
iris_summary <- use_function(dataset = iris_ds, .f = "summary")
The new dataset, iris_summary, shows that it was created by a read.csv call, and that the summary method then gave it its current form and contents.
knitr::kable(attr(iris_summary, "Date"))
The dataset class maintains the entire processing history and all the bibliographic references of the functions that contributed to its creation. [Their versions will be recorded, too.] This makes objects of the dataset class fully reproducible.
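The idea of carrying a processing history along with the data can be sketched with plain R attributes. This is a minimal illustration under stated assumptions, not how the dataset class stores its metadata: the attribute name "history" and the helpers record_step() and apply_and_record() are hypothetical.

```r
# Hypothetical sketch, NOT the dataset package's implementation:
# keep a log of processing steps in an attribute named "history".
record_step <- function(x, step) {
  entry <- data.frame(
    step = step,
    time = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    stringsAsFactors = FALSE
  )
  # rbind() ignores a NULL history, so the first entry starts the log
  attr(x, "history") <- rbind(attr(x, "history"), entry)
  x
}

# Apply a function given by name, carry the earlier history over to the
# result, and append the new step.
apply_and_record <- function(x, .f, ...) {
  fun <- get(.f, mode = "function")
  result <- do.call(fun, list(x, ...))
  attr(result, "history") <- attr(x, "history")
  record_step(result, .f)
}

df <- record_step(iris, "utils::read.csv")     # pretend df came from a file
df_summary <- apply_and_record(df, "summary")
attr(df_summary, "history")$step
```

Because the result of each step inherits the log of its input, the final object lists every function that contributed to it, in order.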