dataset: Structure a data frame to dataset

View source: R/dataset.R

dataset    R Documentation

Structure a data frame to dataset

Description

A DataSet is a collection of statistical data that corresponds to a defined structure.

Usage

dataset(
  x,
  Dimensions = NULL,
  Measures = NULL,
  Attributes = NULL,
  sdmx_attributes = NULL,
  Title = NULL,
  Label = NULL,
  Creator = NULL,
  Publisher = NULL,
  Issued = NULL,
  Identifier = NULL,
  Subject = NULL,
  Type = "DCMITYPE:Dataset"
)

is.dataset(x)

as.data.frame(x, ...)

## S3 method for class 'dataset'
as.data.frame(x, ...)

## S3 method for class 'dataset'
subset(x, ...)

## S3 method for class 'dataset'
x[i, j, ...]

## S3 method for class 'dataset'
summary(object, ...)

## S3 method for class 'dataset'
print(x, ...)

Arguments

x

A data.frame, an object that inherits from data.frame (such as a tibble), or a structured list.

Dimensions

The name or column number of the dimensions within the dataset.

Measures

The name or column number of the measures within the dataset.

Attributes

The name or column number of the attributes within the dataset.

sdmx_attributes

The optional dimensions and attributes that conform with SDMX. For example, c("time", "geo") marks the "time" and "geo" attributes as conforming to SDMX. See sdmx-attribute.

Title

Corresponds to dct:title, a name given to the resource. DataCite allows the use of alternate titles, too. See dataset_title.

Label

May be used to provide a human-readable version of the dataset's name: a text description (optionally with a language tag) as defined by rdfs:label.

Creator

An entity primarily responsible for making the resource. Corresponds to dct:creator and to Creator in DataCite. See creator.

Publisher

Corresponds to dct:publisher and Publisher in DataCite. The name of the entity that holds, archives, publishes, prints, distributes, releases, issues, or produces the resource. This property will be used to formulate the citation, so consider the prominence of the role. For software, use Publisher for the code repository. If there is an entity other than a code repository that "holds, archives, publishes, prints, distributes, releases, issues, or produces" the code, use the property Contributor/contributorType/hostingInstitution for the code repository. See publisher.

Issued

Corresponds to dct:date.

Identifier

An unambiguous reference to the resource within a given context. Recommended practice is to identify the resource by means of a string conforming to an identification system. Examples include International Standard Book Number (ISBN), Digital Object Identifier (DOI), and Uniform Resource Name (URN). Select an identifier scheme from the registered URI schemes maintained by IANA. More details: Guidelines for using resource identifiers in Dublin Core metadata and IEEE LOM. Similar to Identifier in DataCite. See identifier.

Subject

Recommended for discovery in DataCite. Subject, keyword, classification code, or key phrase describing the resource. Similar to dct:subject.
Use subject to properly add a key phrase from a controlled vocabulary and create structured Subject objects with subject_create.

Type

It is set by default to DCMITYPE:Dataset.

...

Other parameters for the print, summary and as.data.frame methods.

i

elements to extract or replace: numeric, character, empty or logical.

j

elements to extract or replace: numeric, character, empty or logical.

object

an object for which a summary is desired.
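
As the descriptions of Dimensions, Measures and Attributes note, columns can be selected by name as well as by column number. The following is a minimal sketch of the name-based form, assuming the package version documented on this page is installed. The object name small and the toy values are illustrative only, and the metadata arguments mirror those used in the Examples section.

```r
library(dataset)

# Toy data frame; the values are made up for illustration.
toy <- data.frame(
  time  = c(2021, 2022),
  geo   = c("NL", "NL"),
  value = c(2, 4),
  unit  = c("NR", "NR"),
  freq  = c("A", "A")
)

# Assign column roles by name instead of by position.
small <- dataset(
  x = toy,
  Dimensions = c("time", "geo"),
  Measures = "value",
  Attributes = c("unit", "freq"),
  Title = "Toy dataset",
  Creator = person("Jane", "Doe"),
  Publisher = "Publishing Co.",
  Issued = as.Date("2022-07-14")
)

# Check whether the result is recognised as a dataset.
is.dataset(small)
```

Selecting by name tends to be more robust than selecting by position when the column order of the input data frame may change.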

Details

Loosely follows the RDF Data Cube Vocabulary, but without the definition of data slices.
bibentry_dataset is a wrapper around bibentry to correctly turn the metadata of the dataset into a bibentry object.
as.data.frame coerces a dataset into a data.frame in a way that the metadata attributes are retained.
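
The coercion described above can be inspected with a short sketch like the one below, assuming the package version documented on this page is installed. It uses the iris_dataset object listed under See Also. The names of the retained metadata attributes are not specified on this page, so the sketch lists them rather than assuming any particular one.

```r
library(dataset)

# Coerce the packaged example dataset to a data.frame.
df <- as.data.frame(iris_dataset)

# The result can be used wherever a data.frame is expected.
is.data.frame(df)

# List the attributes attached to the coerced object; the metadata
# carried over from the dataset should appear among them.
names(attributes(df))
```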

Value

A data frame-like object with structural and referential metadata.

See Also

iris_dataset

Other dataset functions: dataset_local_id(), dataset_uri()

Examples

library(dataset)

# Build a dataset from a data frame; column roles are assigned by position.
my_dataset <- dataset(
  x = data.frame(
    time  = rep(2019:2022, 2),
    geo   = c(rep("NL", 4), rep("BE", 4)),
    value = c(1, 3, 2, 4, 2, 3, 1, 5),
    unit  = rep("NR", 8),
    freq  = rep("A", 8)
  ),
  Dimensions = c(1, 2),   # time, geo
  Measures = 3,           # value
  Attributes = c(4, 5),   # unit, freq
  sdmx_attributes = c("time", "freq"),
  Title = "Example dataset",
  Creator = person("Jane", "Doe"),
  Publisher = "Publishing Co.",
  Issued = as.Date("2022-07-14")
)

## iris_dataset is a dataset class version of iris
as.data.frame(iris_dataset)

dataset documentation built on March 31, 2023, 10:24 p.m.