vignettes/support.md

title: Overview of support for the Datapackage Specification v2
author: Jan van der Laan
css: "style.css"

Note that any property can always be obtained and set using `dp_property()` and `dp_property<-()` respectively. Therefore, when specific support for a property is missing from the tables below, the property can still be obtained and set.
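As a minimal sketch of the generic accessors, the example below sets and reads the 'version' property, which has no dedicated getter or setter in the Data Package table. It assumes the accessors are exported as `dp_property()` and `dp_property<-()` and take the property name as their second argument, and that `new_datapackage()` accepts a directory plus `name` and `title` arguments; the directory and the values used are purely illustrative.

```r
library(datapackage)

# Create a new, editable Data Package in a temporary directory.
# Assumption: new_datapackage(path, name = , title = ) as used in the
# package's introductory material.
dir <- tempfile()
dp <- new_datapackage(dir, name = "example", title = "An Example Data Package")

# 'title' has a dedicated accessor; the generic accessor is assumed to
# return the same property.
dp_title(dp)
dp_property(dp, "title")

# 'version' has no dedicated accessor, so use the generic ones.
dp_property(dp, "version") <- "1.0.0"
dp_property(dp, "version")
```

The same pattern should apply to Data Resources and Field Descriptors, since the generic accessors are described as working for any property.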

Data Package

| Property | Getting | Setting |
|---|---|---|
| `resources` | `dp_resource()`, `dp_resource_names()` | `dp_resource<-()`, `dp_resources<-()` |
| `$schema` | | |
| `name` | `dp_name()` | `dp_name<-()` |
| `id` | `dp_id()` | `dp_id<-()` |
| `licenses` | | |
| `title` | `dp_title()` | `dp_title<-()` |
| `description` | `dp_description()` | `dp_description<-()` |
| `homepage` | | |
| `image` | | |
| `version` | | |
| `created` | `dp_created()` | `dp_created<-()` |
| `keywords` | `dp_keywords()` | `dp_keywords<-()` |
| `contributors` | `dp_contributors()` | `dp_contributors<-()`, `dp_add_contributor()`, `dp_new_contributor()` |
| `sources` | | |

Data Resource

| Property | Getting | Setting |
|---|---|---|
| `name` | `dp_name()` | `dp_name<-()` |
| `path` | `dp_path()` | `dp_path<-()` |
| `data` | `dp_get_data()` returns the data either from the 'data' property or by reading from the 'path'. | Data can be written to file using `dp_write_data()`, but not to the 'data' property inside the datapackage.json. |
| `type` | | |
| `$schema` | | |
| `title` | `dp_title()` | `dp_title<-()` |
| `description` | `dp_description()` | `dp_description<-()` |
| `format` | `dp_format()`. `dp_get_data()` uses 'format' as the primary determinant of which reader to use to read the data. | `dp_format<-()` |
| `mediatype` | `dp_mediatype()`. When 'format' is missing, `dp_get_data()` uses 'mediatype' to determine which reader to use. | `dp_mediatype<-()`, `dp_generate_dataresource()` |
| `encoding` | `dp_encoding()` | `dp_encoding<-()` |
| `bytes` | `dp_bytes()`<sup>a</sup> | `dp_bytes<-()`<sup>a</sup> |
| `hash` | `dp_hash()`<sup>a</sup> | `dp_hash<-()`<sup>a</sup> |
| `sources` | | |
| `licenses` | | |

<sup>a</sup> The number of bytes and the hash can be obtained and set. There is no functionality to check whether the file indeed has the specified number of bytes or hash, or to calculate these automatically from the given file(s).

Tabular Data Resource

| Property | Getting | Setting |
|---|---|---|
| `dialect` | See 'Table Dialect'. There is no function to specifically get the 'dialect' information. The data resource is passed to the reader functions, which access this information. | See 'Table Dialect'. The writer functions use this information when writing. There is no specific function to change this information. By default the (safe) default values are used. |
| `schema` | `dp_schema()`; also see 'Table Schema'. | `dp_generate_dataresource()` generates an appropriate schema for a given data set. |

Table Dialect

As mentioned above, the 'dialect' property cannot be set directly. The table below indicates which properties are recognised when reading and writing data. Items are marked as supported, unsupported or irrelevant based on the support offered by the CSV reader and writer.

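| Property | Reading | Writing |
|---|---|---|
| `$schema` | | |
| `header` | CSV | CSV |
| `headerRows` | | |
| `headerJoin` | | |
| `commentRows` | | |
| `commentChar` | CSV | CSV |
| `delimiter` | CSV | CSV |
| `lineTerminator` | CSV<sup>a</sup> | CSV<sup>a</sup> |
| `quoteChar` | CSV<sup>b</sup> | CSV<sup>b</sup> |
| `doubleQuote` | CSV<sup>c</sup> | CSV<sup>c</sup> |
| `escapeChar` | | |
| `nullSequence` | CSV | CSV |
| `skipInitialSpace` | CSV | CSV |
| `property` | | |
| `itemType` | | |
| `itemKeys` | | |
| `sheetNumber` | | |
| `sheetName` | | |
| `table` | | |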

<sup>a</sup> Only `\n`, `\r` or `\r\n` is accepted.

<sup>b</sup> Only `"` is accepted.

<sup>c</sup> Only `true` is accepted.

Table Schema

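| Property | Getting | Setting |
|---|---|---|
| `$schema` | | |
| `fields` | `dp_field()`, `dp_field_names()` | |
| `fieldsMatch` | Used and checked by `dp_apply_schema()` and `dp_check_dataresource()`. | |
| `missingValues` | | |
| `primaryKey` | | |
| `uniqueKeys` | | |
| `foreignKeys` | | |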

Field Descriptor

| Property | Getting | Setting |
|---|---|---|
| `name` | `dp_name()`; also used by `dp_get_data()`. | `dp_name<-()` |
| `type` | `dp_type()`; also used by `dp_get_data()`. | `dp_type<-()` |
| `format` | `dp_format()` | `dp_format<-()` |
| `title` | `dp_title()` | `dp_title<-()` |
| `description` | `dp_description()` | `dp_description<-()` |
| `example` | | |
| `constraints` | Used by `dp_check_field()` and `dp_check_dataresource()`; see 'Field Constraints'. | |
| `categories` | Used by `dp_categorieslist()` and `dp_get_data()`. | Used by `dp_write_data()` and `dp_generate_dataresource()`. |
| `categoriesOrdered` | | |
| `missingValues` | Used by `dp_get_data()`. | |
| `rdfType` | | |

Field Types

As mentioned above, the field descriptors cannot be directly modified or read from. The table below indicates which field types and type-specific properties are recognised when reading and writing data. Items are marked as supported, unsupported or irrelevant based on the support offered by the CSV reader and writer.

When a type is not supported the data will be read as a character string.

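| Property | Reading | Writing |
|---|---|---|
| `string` | CSV | CSV |
| &nbsp;&nbsp;`format` | | |
| `number` | CSV | CSV |
| &nbsp;&nbsp;`NaN`, `INF`, `-INF` | CSV | CSV |
| &nbsp;&nbsp;`exponent` | CSV | CSV |
| &nbsp;&nbsp;`decimalChar` | CSV | CSV |
| &nbsp;&nbsp;`groupChar` | CSV | CSV |
| &nbsp;&nbsp;`bareNumber` | CSV | CSV |
| `integer` | CSV | CSV |
| &nbsp;&nbsp;`groupChar` | CSV | CSV |
| &nbsp;&nbsp;`bareNumber` | CSV | CSV |
| `boolean` | CSV | CSV |
| &nbsp;&nbsp;`trueValues` | CSV | CSV |
| &nbsp;&nbsp;`falseValues` | CSV | CSV |
| `object` | | |
| `array` | | |
| `list` | | |
| &nbsp;&nbsp;`delimiter` | | |
| &nbsp;&nbsp;`itemType` | | |
| `datetime` | CSV | CSV |
| &nbsp;&nbsp;`format` ("default", "", "any") | CSV | CSV |
| `date` | CSV | CSV |
| &nbsp;&nbsp;`format` ("default", "", "any") | CSV | CSV |
| `time` | CSV | CSV |
| &nbsp;&nbsp;`format` ("default", "", "any") | CSV | CSV |
| `year` | CSV | CSV |
| `yearmonth` | CSV | CSV |
| `duration` | | |
| `geopoint` | | |
| &nbsp;&nbsp;`format` ("default", "array", "object") | | |
| `geojson` | | |
| &nbsp;&nbsp;`format` ("default", "topojson") | | |
| `any` | | |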

Field Constraints

The functions `dp_check_dataresource()` and `dp_check_field()` check whether a given data.frame or vector is valid given the Data Resource or Field Descriptor. By default these also check any constraints on the fields. The default CSV and fixed-width readers do not run these checks.

| Property | Checking constraints | Getting | Setting |
|---|---|---|---|
| `required` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |
| `unique` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |
| `minLength` | | | |
| `maxLength` | | | |
| `minimum` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |
| `maximum` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |
| `exclusiveMinimum` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |
| `exclusiveMaximum` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |
| `jsonSchema` | | | |
| `pattern` | | | |
| `enum` | `dp_check_field()`, `dp_check_dataresource(..., constraints = TRUE)` | | |

djvanderlaan/datapackage documentation built on June 12, 2025, 2:44 a.m.