CSVY is a file format that combines the simplicity of CSV (comma-separated values) with the metadata of other plain text and binary formats (JSON, XML, Stata, etc.). The CSVY file specification is simple: place a YAML header on top of a regular CSV. The YAML header is formatted according to the Table Schema of a Tabular Data Package.
A CSVY file looks like this:
#---
#profile: tabular-data-resource
#name: my-dataset
#path: https://raw.githubusercontent.com/csvy/csvy.github.io/master/examples/example.csvy
#title: Example file of csvy
#description: Show a csvy sample file.
#format: csvy
#mediatype: text/vnd.yaml
#encoding: utf-8
#schema:
#  fields:
#  - name: var1
#    type: string
#  - name: var2
#    type: integer
#  - name: var3
#    type: number
#dialect:
#  csvddfVersion: 1.0
#  delimiter: ","
#  doubleQuote: false
#  lineTerminator: "\r\n"
#  quoteChar: "\""
#  skipInitialSpace: true
#  header: true
#sources:
#- title: The csvy specifications
#  path: http://csvy.org/
#  email: ''
#licenses:
#- name: CC-BY-4.0
#  title: Creative Commons Attribution 4.0
#  path: https://creativecommons.org/licenses/by/4.0/
#---
var1,var2,var3
A,1,2.0
B,3,4.3
A file like this can be read into R as follows:
library("csvy")
str(read_csvy(system.file("examples", "example1.csvy", package = "csvy")))
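To make the layout concrete, the header and the data can also be separated by hand. The following is a minimal base R sketch, not how the csvy package itself is implemented: it assumes the header is delimited by two `---` lines (optionally prefixed with `#`), and the small `csvy_lines` vector is made-up content modelled on the example above.

```r
# Illustrative CSVY content; the field names mirror the example above.
csvy_lines <- c(
  "#---",
  "#name: my-dataset",
  "#schema:",
  "#  fields:",
  "#  - name: var1",
  "#    type: string",
  "#  - name: var2",
  "#    type: integer",
  "#---",
  "var1,var2",
  "A,1",
  "B,3"
)

# Locate the two '---' fences (with or without a leading '#').
fences <- grep("^#?---[[:space:]]*$", csvy_lines)
stopifnot(length(fences) >= 2)

# Header lines sit between the fences; drop the optional comment character.
header_yaml <- sub("^#", "", csvy_lines[(fences[1] + 1):(fences[2] - 1)])

# Everything after the closing fence is ordinary CSV.
body <- csvy_lines[(fences[2] + 1):length(csvy_lines)]
dat <- utils::read.csv(text = paste(body, collapse = "\n"),
                       stringsAsFactors = FALSE)

cat(header_yaml, sep = "\n")
str(dat)
```

The recovered `header_yaml` lines are plain YAML text; turning them into an R list would need a YAML parser, which this sketch leaves out.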
Optional comment characters on the YAML lines make the data readable with any standard CSV parser while retaining the ability to import and export variable- and file-level metadata. The CSVY specification does not use these, but the csvy package for R does, so that you (and other users) can continue to rely on utils::read.csv() or readr::read_csv() as usual. The import() function in rio supports CSVY natively.
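As a rough illustration of why the comment characters help, the sketch below reads a commented CSVY file with base R alone. The file contents are invented for the example. Note that `utils::read.csv()` defaults to `comment.char = ""`, so the comment character is passed explicitly here, whereas `utils::read.table()` already defaults to `"#"`.

```r
# A tiny CSVY file whose YAML header lines all start with '#'
# (contents are illustrative only).
csvy_lines <- c(
  "#---",
  "#name: my-dataset",
  "#fields:",
  "#- name: var1",
  "#  type: string",
  "#- name: var2",
  "#  type: integer",
  "#---",
  "var1,var2",
  "A,1",
  "B,3"
)
tmp <- tempfile(fileext = ".csvy")
writeLines(csvy_lines, tmp)

# Skip the '#'-prefixed header lines and parse the rest as ordinary CSV.
d <- utils::read.csv(tmp, comment.char = "#", stringsAsFactors = FALSE)
str(d)

unlink(tmp)
```

The metadata is simply ignored in this case; only the data rows reach the data frame.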
To create a CSVY file from R, just do:
library("csvy")
library("datasets")
write_csvy(iris, "iris.csvy")
It is also possible to export the metadata to a separate YAML or JSON file (and then to import it from that separate file) by specifying the metadata argument in write_csvy() and read_csvy().
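A possible round trip with separate metadata might look like the sketch below. It assumes that `metadata` takes a file path and that the `.yaml` extension selects YAML output; check `?write_csvy` and `?read_csvy` for the exact behaviour in your installed version. The temporary file names are arbitrary.

```r
# Guarded so the snippet is skipped if csvy is not installed.
if (requireNamespace("csvy", quietly = TRUE)) {
  csv_file  <- tempfile(fileext = ".csv")
  meta_file <- tempfile(fileext = ".yaml")

  # Assumption: 'metadata' is the path of the file that receives the metadata,
  # so the data file itself is written without a YAML header.
  csvy::write_csvy(datasets::iris, csv_file, metadata = meta_file)

  # Read the data back, taking the metadata from the separate file.
  d <- csvy::read_csvy(csv_file, metadata = meta_file)
  str(d)

  unlink(c(csv_file, meta_file))
}
```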
To read a CSVY into R, just do:
d1 <- read_csvy("iris.csvy")
str(d1)
or use any other appropriate data import function to ignore the YAML metadata:
d2 <- utils::read.table("iris.csvy", sep = ",", header = TRUE)
str(d2)
unlink("iris.csvy")
The package is available on CRAN and can be installed directly in R using:
install.packages("csvy")
The latest development version on GitHub can be installed using remotes:
if (!require("remotes")) {
    install.packages("remotes")
}
remotes::install_github("leeper/csvy")