dataset_df: Create a new 'dataset_df' object


dataset_df R Documentation

Create a new dataset_df object

Description

The dataset_df() constructor creates semantically rich data frames. These inherit from tibble::tibble and store structured metadata in attributes.

Usage

dataset_df(
  ...,
  identifier = c(obs = "http://example.com/dataset#obs"),
  var_labels = NULL,
  units = NULL,
  concepts = NULL,
  dataset_bibentry = NULL,
  dataset_subject = NULL
)

as_dataset_df(
  df,
  identifier = c(obs = "http://example.com/dataset#obs"),
  var_labels = NULL,
  units = NULL,
  concepts = NULL,
  dataset_bibentry = NULL,
  dataset_subject = NULL,
  ...
)

is.dataset_df(x)

## S3 method for class 'dataset_df'
print(x, ...)

is_dataset_df(x)

Arguments

...

Vectors (columns) that should be included in the dataset.

identifier

A named vector of one or more URI prefixes for row IDs. Defaults to c(obs = "http://example.com/dataset#obs"). For example, if your dataset will be published under the DOI https://doi.org/1234, you may use c(obs = "https://doi.org/1234#"), which will generate row URIs such as https://doi.org/1234#1, ..., https://doi.org/1234#n.

var_labels

A named list of human-readable labels for each variable.

units

A named list of measurement units for measured variables.

concepts

A named list of linked concepts (URIs) for variables or dimensions.

dataset_bibentry

A bibliographic metadata record for the dataset, created using datacite() or dublincore().

dataset_subject

A subject descriptor created with subject() or subject_create().

df

A data.frame to convert to a dataset_df.

x

A dataset_df object (used in method dispatch).
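
The row URI pattern described for the identifier argument can be illustrated with a short base R sketch. It only reproduces the documented pattern (prefix followed by the row number) and is not the package's internal code; the DOI prefix and the row count are assumptions taken from the example in the argument description.

```r
# Hypothetical prefix, as in the identifier argument description.
identifier <- c(obs = "https://doi.org/1234#")

# Assume a dataset with three rows.
n_rows <- 3

# Append the row number to the prefix to form one URI per row.
row_uris <- paste0(identifier[["obs"]], seq_len(n_rows))
row_uris
```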

Details

Use is.dataset_df() to check class membership.

S3 methods for dataset_df include:

  • print() to display the dataset with metadata

  • summary() to summarize both data and metadata

For full details, see vignette("dataset_df", package = "dataset").
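
As a brief illustration of the class check, the following sketch converts a base data.frame with as_dataset_df() and tests the result with is.dataset_df(). It assumes the dataset package is installed; the column names, values, and DOI prefix are made up for illustration.

```r
library(dataset)

# Illustrative data; the column names and values are assumptions.
df <- data.frame(
  country = c("AD", "LI"),
  gdp = c(3897, 7365)
)

# Convert with an explicit row-identifier prefix (hypothetical DOI).
ds <- as_dataset_df(df, identifier = c(obs = "https://doi.org/1234#"))

is.dataset_df(ds) # check the converted object
is.dataset_df(df) # check the original plain data.frame
```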

Value

A dataset_df object: a tibble with attached metadata stored in attributes.

is.dataset_df() returns a logical value: TRUE if the object is of class dataset_df, FALSE otherwise.
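
To make the phrase "metadata stored in attributes" concrete, here is a minimal base R sketch of the general mechanism. The attribute name example_title is hypothetical and is not claimed to match the attribute names that dataset_df actually uses.

```r
# A plain data frame with illustrative values.
df <- data.frame(gdp = c(3897, 7365))

# Attach a piece of metadata as an attribute (hypothetical name).
attr(df, "example_title") <- "GDP of Andorra and Liechtenstein"

# The data are unchanged; the metadata is stored on the same object.
attr(df, "example_title")
nrow(df)
```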

Note

A simple, serverless scaffolding for publishing dataset_df objects on the web (with HTML + RDF exports) is available at https://github.com/dataobservatory-eu/dataset-template.

See Also

defined(), dublincore(), datacite(), subject()

Examples

my_dataset <- dataset_df(
  country_name = defined(
    c("AD", "LI"),
    concept = "http://data.europa.eu/bna/c_6c2bb82d",
    namespace = "https://www.geonames.org/countries/$1/"
  ),
  gdp = defined(
    c(3897, 7365),
    label = "Gross Domestic Product",
    unit = "million dollars",
    concept = "http://data.europa.eu/83i/aa/GDP"
  ),
  identifier = c(
    obs = "https://dataobservatory-eu.github.io/dataset-template#"
  ),
  dataset_bibentry = dublincore(
    title = "GDP of Andorra and Liechtenstein",
    description = "A small but semantically rich dataset example.",
    creator = person("Jane", "Doe", role = "cre"),
    publisher = "Open Data Institute",
    language = "en"
  )
)

# Basic usage
print(my_dataset)
head(my_dataset)
summary(my_dataset)

# Metadata access
as_dublincore(my_dataset)
as_datacite(my_dataset)

# Export description as RDF triples
my_description <- describe(my_dataset, con = tempfile())
my_description


dataset documentation built on Nov. 16, 2025, 5:06 p.m.