knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
The goal of dataobservatory is to facilitate the automated documentation of datasets and the automated recording of their descriptive and administrative (statistical processing) metadata. It also helps record information about the computational environment to increase reproducibility.
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("dataobservatory-eu/dataobservatory")
The dataset S3 class extends the data frame and tibble classes. It carries metadata attributes that facilitate the automated documentation of the dataset, and it has its own print and summary methods.
library(dataobservatory)
data("small_population")
small_population_dataset <- dataset(
  x = small_population,
  dataset_code = "small_population_total",
  dataset_title = "Population of Small European Countries",
  freq = "A",
  unit = "NR",
  unit_name = "number"
)
small_population_dataset
summary(small_population_dataset)
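To show how an S3 class can extend a data frame with metadata attributes and its own print method, here is a minimal sketch in base R. It is not the dataobservatory implementation: `new_toy_dataset()`, the `toy_dataset` class, and the example values are hypothetical and exist only for illustration.

```r
# Hypothetical sketch in base R; NOT the dataobservatory implementation.
new_toy_dataset <- function(x, dataset_code, dataset_title) {
  stopifnot(is.data.frame(x))
  # Store the metadata as attributes next to the data.
  attr(x, "dataset_code") <- dataset_code
  attr(x, "dataset_title") <- dataset_title
  # Prepend the new class so data frame methods still apply.
  class(x) <- c("toy_dataset", class(x))
  x
}

print.toy_dataset <- function(x, ...) {
  # Print a metadata header, then fall back to the data frame printer.
  cat(attr(x, "dataset_title"), " [", attr(x, "dataset_code"), "]\n", sep = "")
  NextMethod()
  invisible(x)
}

# Made-up placeholder values, for illustration only.
toy <- new_toy_dataset(
  data.frame(geo = c("AA", "BB"), value = c(1, 2)),
  dataset_code = "toy_total",
  dataset_title = "Toy Dataset"
)
print(toy)
```

Because the object keeps `data.frame` in its class vector, ordinary data frame functions continue to work on it, while the attributes travel with the object.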
small_population_datacite <- datacite_dataset( dataset = small_population_dataset, Subject = "Demography", Creator = "Joe, Doe")
The datacite class (see ?datacite) is a modification of a data frame (tibble) object, and it creates all the mandatory and recommended fields of the DataCite metadata schema for a dataset. It also covers all the properties of the more general Dublin Core standard, but in some cases the property name is different and follows the DataCite naming convention.
The descriptive metadata can be added with the datacite() constructor (see ?datacite) or the datacite_dataset() helper function. For more details, read the DataCite Descriptive Metadata vignette article.
The datacite class can be connected automatically to many scientific repositories, including Zenodo. In later versions, this will enable the user to upload the newly created dataset (or dataset version) and receive a digital object identifier (DOI), or version DOI, for it.
See more about the metadata concepts applied in the FAIR Data and the Added Value of Rich Metadata chapter of the Automated Observatory Contributors’ Handbook.
print(small_population_datacite)
The statistical processing information can be added with the codebook class, which is not yet fully implemented. Read The codebook Class vignette article.
The codebook S3 class (not yet fully documented, and without an independent constructor so far) records the statistical processing metadata of a dataset.
It contains a full codebook that follows the SDMX statistical metadata codelist standards. It also records the session information of all processing steps, and adds the R packages or software code that generated the results to the descriptive metadata.
For example, annual observations follow the SDMX Code List for Frequency 2.1 (CL_FREQ) definition, and they can also be translated to the ISO 8601 time metadata standard.
add_frequency("A", "list")
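The translation between CL_FREQ codes and ISO 8601 can be sketched as a simple lookup table. The sketch below is independent of add_frequency(), whose return format is not shown here; `freq_to_iso8601()` and `freq_lookup` are hypothetical names, and only a few common frequency codes are covered.

```r
# Hypothetical helper; not part of dataobservatory.
# Maps a few common SDMX CL_FREQ codes to ISO 8601 durations.
freq_lookup <- c(
  A = "P1Y",  # annual
  S = "P6M",  # half-yearly (semester)
  Q = "P3M",  # quarterly
  M = "P1M",  # monthly
  W = "P1W",  # weekly
  D = "P1D"   # daily
)

freq_to_iso8601 <- function(freq) {
  # Unknown codes yield NA when indexing a named vector by name.
  iso <- unname(freq_lookup[freq])
  if (anyNA(iso)) {
    stop("Unknown frequency code: ", paste(freq[is.na(iso)], collapse = ", "))
  }
  iso
}

freq_to_iso8601(c("A", "Q"))
```

Failing loudly on an unknown code is a design choice for this sketch: a silently missing frequency would propagate into the recorded metadata.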
add_sessioninfo()
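To illustrate what recording the computational environment can look like, the following base R sketch captures a few fields from utils::sessionInfo(). It does not reproduce what add_sessioninfo() returns; the `processing_record` list and its field names are assumptions made for this example.

```r
# Hypothetical sketch; the structure of this record is an assumption,
# not the output format of add_sessioninfo().
si <- utils::sessionInfo()

processing_record <- list(
  # Timestamp of the processing step in UTC, written in ISO 8601 form.
  recorded_at = format(Sys.time(), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
  r_version = si$R.version$version.string,
  platform = si$platform,
  # Names of attached non-base packages (NULL if none are attached).
  attached_packages = names(si$otherPkgs)
)

str(processing_record)
```

Storing such a record with each processing step makes it possible to tell later which R version and packages produced a given result.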
Please note that the dataobservatory project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.