knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Table Dialect (previously called CSV dialect) is a simple format to describe the dialect of a tabular data file, including its delimiter, header rows, escape characters, etc.

::: {.callout-info} In this document we use the terms "package" for Data Package, "resource" for Data Resource, "dialect" for Table Dialect, and "schema" for Table Schema. :::

General implementation

Frictionless supports most dialect properties when reading Tabular Data Resources. Dialect manipulation is limited to setting a delimiter. When writing resources, it (mainly) makes use of default dialect properties, removing the need to define them.

Read

read_resource() uses the dialect property of a resource to parse a tabular data file. Only properties that deviate from the default need to be specified. For example, a tab-delimited file without a header row must have the following dialect:

"dialect": {
  "delimiter": "\t",
  "header": false
}
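The rule that only deviating properties need to be specified can be sketched in base R as a merge of a partial dialect with the defaults. This is an illustration, not the package's internal code: `dialect_defaults` and `resolve_dialect()` are hypothetical names, and the defaults are limited to those stated in this document.

```r
# Hypothetical defaults, limited to the values documented in this vignette
dialect_defaults <- list(
  delimiter = ",",
  quoteChar = "\"",
  doubleQuote = TRUE,
  skipInitialSpace = FALSE,
  header = TRUE
)

# Hypothetical helper: properties given in `dialect` override the defaults
resolve_dialect <- function(dialect = list()) {
  utils::modifyList(dialect_defaults, dialect)
}

# Dialect of a tab-delimited file without a header row
resolved <- resolve_dialect(list(delimiter = "\t", header = FALSE))
str(resolved)
```

Properties that are not listed in the dialect (here quoteChar, doubleQuote and skipInitialSpace) keep their default values.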

Manipulate

Frictionless does not support direct manipulation of the dialect. add_resource() allows setting one property (dialect$delimiter) when data are provided as a file; all other properties are assumed to be the default.

Write

write_package() writes a package to disk as a datapackage.json file. This file includes the metadata of all the resources, including the dialect (if defined). write_package() writes resources created from a data frame to CSV files, but no dialect property is set for those, since only defaults are used.

Properties implementation

delimiter

delimiter is used by read_resource() and defaults to ",". It is passed to delim in readr::read_delim(). add_resource() does not set delimiter, unless provided in delim and different from the default ",":

library(frictionless)
package <- example_package()

# Replace the "observations" resource with a tab-delimited file
path <- system.file("extdata", "v1", "observations_1.tsv", package = "frictionless")
package <- add_resource(package, "observations", data = path, delim = "\t", replace = TRUE)

# The non-default delimiter is stored in the dialect of the resource
package$resources[[2]]$dialect$delimiter

lineTerminator

lineTerminator is ignored by read_resource(). It relies on readr::read_delim() instead, which handles the line terminators LF and CRLF automatically and does not support CR (used by Classic Mac OS, final release 2001).

quoteChar

quoteChar is used by read_resource() and defaults to ". It is passed to quote in readr::read_delim().

doubleQuote

doubleQuote is used by read_resource() and defaults to true, but can be overruled by escapeChar. It is passed to escape_double in readr::read_delim().

escapeChar

escapeChar is ignored by read_resource() unless it is "\\". In that case, it is passed as escape_backslash = TRUE and escape_double = FALSE in readr::read_delim().

::: {.callout-warning} escapeChar and doubleQuote are mutually exclusive, so you cannot escape with \" and "" in the same file. :::
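The interaction between escapeChar and doubleQuote can be sketched as follows. This is a minimal illustration of the rules described above, not the package's actual code: `escape_args()` is a hypothetical helper that takes a dialect as a plain list and returns the two escape arguments for readr::read_delim().

```r
# Hypothetical helper: derive readr's escape arguments from a dialect list
escape_args <- function(dialect = list()) {
  # Only a backslash escapeChar is taken into account
  backslash <- identical(dialect$escapeChar, "\\")
  # doubleQuote defaults to true when not specified
  double_quote <- if (is.null(dialect$doubleQuote)) TRUE else dialect$doubleQuote
  list(
    escape_backslash = backslash,
    # A backslash escapeChar overrules doubleQuote
    escape_double = if (backslash) FALSE else double_quote
  )
}

str(escape_args(list()))                                      # defaults
str(escape_args(list(escapeChar = "\\")))                     # backslash escaping
str(escape_args(list(escapeChar = "\\", doubleQuote = TRUE))) # escapeChar wins
```

Because the two arguments are never both TRUE in this sketch, a file that mixes \" and "" escapes cannot be described.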

nullSequence

nullSequence is ignored by read_resource(). Provide it as missingValues in the schema instead (see vignette("table-schema")).
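One way to follow this advice is to move the value by hand before reading the data. The sketch below assumes the resource is available as a plain list with an inline schema (not a path or URL to a schema file). The `resource` object and `move_null_sequence()` are hypothetical and not part of the package.

```r
# Hypothetical resource, represented as a plain list
resource <- list(
  name = "observations",
  dialect = list(nullSequence = "NA"),
  schema = list(missingValues = c(""))
)

# Hypothetical helper: move dialect$nullSequence to schema$missingValues
move_null_sequence <- function(resource) {
  null_sequence <- resource$dialect$nullSequence
  if (!is.null(null_sequence)) {
    current <- resource$schema$missingValues
    # Table Schema defines an empty string as the default missing value
    if (is.null(current)) current <- ""
    resource$schema$missingValues <- union(current, null_sequence)
    # Drop the property that would otherwise be ignored
    resource$dialect$nullSequence <- NULL
  }
  resource
}

resource <- move_null_sequence(resource)
resource$schema$missingValues
```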

skipInitialSpace

skipInitialSpace is used by read_resource() and defaults to false. It is passed to trim_ws in readr::read_delim().

header

header is used by read_resource() and defaults to true. It is passed as skip = 1 (or 0) in readr::read_delim().
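The header rule amounts to deciding how many lines to skip before the data rows. A minimal sketch, assuming the header row is skipped through the skip argument of readr::read_delim() and that column names are supplied separately (for example from the schema). `header_to_skip()` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: number of lines to skip before the data rows
header_to_skip <- function(dialect = list()) {
  # header defaults to true when not specified
  header <- if (is.null(dialect$header)) TRUE else dialect$header
  if (isTRUE(header)) 1 else 0
}

header_to_skip(list())               # header not specified, default applies
header_to_skip(list(header = FALSE)) # file without a header row
```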

commentChar

commentChar is used by read_resource() and defaults to undefined. It is passed to comment in readr::read_delim().

caseSensitiveHeader

caseSensitiveHeader is ignored by read_resource().

csvddfVersion

csvddfVersion is ignored by read_resource().



frictionlessdata/frictionless-r documentation built on April 17, 2025, 11:45 a.m.