dspl: Builds Dataset Publication Language (DSPL) metadata file


Description

dspl parses the csv, tab or xls(x) files found in a given directory and generates a complete DSPL file. If an output path is specified, the function builds the complete ZIP (DSPL file plus csv files), ready to be uploaded to Google Public Data Explorer.

Usage

dspl(path, output = NA, replace = F, targetNamespace = "",
  timeFormat = "yyyy", lang = c("es", "en"), name = NA,
  description = NA, url = NA, providerName = NA, providerURL = NA,
  sep = ";", dec = ".", encoding = getOption("encoding"),
  moreinfo = NULL)

new_dspl(path, output = NA, replace = F, targetNamespace = "",
  timeFormat = "yyyy", lang = c("es", "en"), name = NA,
  description = NA, url = NA, providerName = NA, providerURL = NA,
  sep = ";", dec = ".", encoding = getOption("encoding"),
  moreinfo = NULL)

Arguments

path

String. Path to the folder that contains the tables (csv|tab|xls).

output

String, optional. Path to the output ZIP file.

replace

Logical. If the output ZIP file already exists, dspl replaces it.

targetNamespace

String. As the DSPL documentation states: “Provides a URI that identifies your dataset. This URI is not required to point to an actual resource, but it's a good idea to have the URI resolve to a document describing your content or dataset”.

timeFormat

String. The time format of the collection, specified according to the Joda-Time format. See the Details section for more information.

lang

A list of strings with the languages supported by the dataset. A single language is allowed.

name

List of strings. The name of the dataset, defined according to the lang list.

description

List of strings. Description of the dataset. Like name, it supports multiple descriptions.

url

The corresponding URL for the dataset.

providerName

List of strings. The data provider name.

providerURL

List of strings. The data provider website URL.

sep

The separator used by the tables in the 'path' folder. Currently accepted values are “,” or “;” (for .csv files), “\t” (for .tab files) and “xls” or “xlsx” (for Microsoft Excel files).

dec

String. Decimal point.

encoding

The character encoding of the input tables. Currently ignored for Microsoft Excel files.

moreinfo

A special tab file generated by the function genMoreInfo. It contains a data frame of the dataset concepts with further specifications such as description, topic, url, etc.
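As a rough illustration of what sep, dec and encoding describe, the sketch below writes a small semicolon-separated table and reads it back with base R's read.table. The folder name, file name and values are made up for the example, and dspl's own reader may behave differently; this only shows how the three arguments relate to the layout of an input file.

```r
# Hypothetical input folder holding one semicolon-separated table.
path <- file.path(tempdir(), "dspl_tables")
dir.create(path, showWarnings = FALSE)
writeLines(
  c("country;year;population",
    "Chile;2010;17.1",
    "Peru;2010;29.4"),
  file.path(path, "population.csv")
)

# Read it back with the separator, decimal mark and encoding that
# would also be passed to dspl(). This is base R, not dspl's reader.
tab <- read.table(
  file.path(path, "population.csv"),
  header = TRUE, sep = ";", dec = ".",
  fileEncoding = "UTF-8", stringsAsFactors = FALSE
)
str(tab)
```

If the file used a comma as decimal mark instead, dec = "," would be the matching value, and sep would have to be something other than ",".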

Details

If no output is defined, the function returns a list of class dspl that contains, among other elements, an XML object (the DSPL file). If an output is defined, the result consists of two things: a ZIP file containing everything needed for upload at publicdata.google.com (a collection of csv files and the written DSPL XML file), and a message (character object).

Internally, the parsing process consists of the following steps:

  1. Loading the data,

  2. Generating the corresponding id for each column,

  3. Identifying the data types,

  4. Building concepts,

  5. Identifying dimensional concepts and distinguishing between categorical, geographical and time dimensions, and

  6. Executing internal checks.
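The first five steps can be pictured with a small base R sketch. It is not the package's implementation: make_id, dspl_type and the rule used to tell dimensions from metrics are hypothetical stand-ins chosen for illustration, and the table is built in memory instead of being loaded from a file.

```r
# Step 1: load the data (an in-memory table stands in for a csv file).
tab <- data.frame(
  "Country Name"          = c("Chile", "Peru"),
  "Year"                  = c(2010L, 2010L),
  "Population (millions)" = c(17.1, 29.4),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Step 2: derive an id from each column label (hypothetical rule:
# lower case, runs of non-alphanumeric characters become "_").
make_id <- function(x) {
  id <- gsub("[^a-z0-9]+", "_", tolower(x))
  gsub("^_+|_+$", "", id)
}
ids <- make_id(names(tab))

# Step 3: map R column classes to DSPL data types.
dspl_type <- function(col) {
  if (is.integer(col)) "integer"
  else if (is.numeric(col)) "float"
  else if (is.logical(col)) "boolean"
  else "string"
}
types <- vapply(tab, dspl_type, character(1), USE.NAMES = FALSE)

# Step 4: build a concepts table.
concepts <- data.frame(id = ids, label = names(tab), type = types,
                       stringsAsFactors = FALSE)

# Step 5: flag dimensions (assumed heuristic: a year/date column is a
# time dimension, string columns are categorical, the rest are metrics).
concepts$role <- ifelse(
  concepts$id %in% c("year", "date"), "time dimension",
  ifelse(concepts$type == "string", "categorical dimension", "metric")
)
print(concepts)
```

Geographical dimensions are left out of the sketch because recognising them needs extra information (for example latitude and longitude columns) that this toy table does not have.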

So that the ZIP file (DSPL file plus CSV data files) loads properly, the function runs a series of internal checks on the data structure.

Value

If no output is defined, dspl returns a list of class "dspl".

An object of class "dspl" is a list containing:

dspl

A character string containing the DSPL XML document, as returned by the saveXML function.

concepts.by.table

A data frame object of concepts stored by table.

dimtabs

A data frame containing dimensional tables.

slices

A data frame of slices.

concepts

A data frame of concepts (all of them).

dimensions

A data frame of dimensional concepts.

statistics

A matrix of statistics.

Otherwise, the function builds a ZIP file at the path given in output, containing the CSV and DSPL (XML) files.

Author(s)

George G. Vega Yon

References

Examples
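A minimal usage sketch, assuming the googlePublicData package is installed. The folder, file name, dataset name and values are hypothetical, and it has not been verified here that such a small table passes dspl's internal checks, so the call is wrapped in try().

```r
# Hypothetical input folder with one semicolon-separated table.
path <- file.path(tempdir(), "dspl_example")
dir.create(path, showWarnings = FALSE)
write.table(
  data.frame(country    = c("Chile", "Peru"),
             year       = c(2010, 2010),
             population = c(17.1, 29.4)),
  file.path(path, "population.csv"),
  sep = ";", dec = ".", row.names = FALSE, quote = FALSE
)

if (requireNamespace("googlePublicData", quietly = TRUE)) {
  # Without `output`, dspl() is documented to return a list of class "dspl".
  res <- try(
    googlePublicData::dspl(path, sep = ";", lang = "en",
                           name = "Example dataset"),
    silent = TRUE
  )
  if (!inherits(res, "try-error")) print(names(res))

  # With `output`, it is documented to write a ZIP file instead, e.g.:
  # googlePublicData::dspl(path, sep = ";", lang = "en",
  #                        output = file.path(tempdir(), "example.zip"),
  #                        replace = TRUE)
}
```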


googlePublicData documentation built on May 2, 2019, 3:45 a.m.