db.read: Read a set of text files forming a data base

db.read    R Documentation

Read a set of text files forming a data base

Description

Reads a set of delimited text files that together form a relational data base.

Usage

db.read(dir = ".", ext = "tsv", sep = "\t", ...)

Arguments

dir

Directory containing the text files (character string). It is passed to the path argument of list.files.

ext

A character string giving the file extension used to identify the files to be read, typically three characters such as 'csv' or 'tsv'. The leading dot is assumed implicitly and must be omitted.

sep

The field delimiter (character) for use with read.table.

...

Further optional arguments passed to read.table.

Value

A list, each element of which is a data frame. Element names are constructed from the source file names by stripping the specified extension and the preceding dot.

Note

The text files are read with read.table using the fixed arguments header=TRUE and stringsAsFactors=FALSE. Thus, one should not try to override these settings via the ... argument. Other optional arguments of read.table, such as skip or encoding, can be set.
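The behaviour described above can be approximated in base R. The following is only a sketch, not the package's source code: the function name read_tables_sketch and the toy files are hypothetical, and the sketch assumes that ext contains no regular expression metacharacters.

```r
# Hypothetical re-implementation for illustration; NOT the package's source.
read_tables_sketch <- function(dir = ".", ext = "tsv", sep = "\t", ...) {
  # Match file names ending in a literal dot followed by the extension
  # (assumes 'ext' holds no regular expression metacharacters)
  pattern <- paste0("\\.", ext, "$")
  files <- list.files(path = dir, pattern = pattern, full.names = TRUE)
  tables <- lapply(files, function(f) {
    # header and stringsAsFactors are fixed; further options arrive via ...
    read.table(f, sep = sep, header = TRUE, stringsAsFactors = FALSE, ...)
  })
  # Element names: file names without directory, dot, and extension
  names(tables) <- sub(pattern, "", basename(files))
  tables
}

# Usage with two small made-up files in a temporary directory
tmp <- tempfile("db_")
dir.create(tmp)
write.table(data.frame(id = 1:2, name = c("north", "south")),
            file.path(tmp, "locations.tsv"), sep = "\t",
            row.names = FALSE, quote = FALSE)
write.table(data.frame(date = c("2020-01-01", "2020-01-02"),
                       id_location = c(1, 2)),
            file.path(tmp, "samples.tsv"), sep = "\t",
            row.names = FALSE, quote = FALSE)
db_sketch <- read_tables_sketch(dir = tmp)
print(names(db_sketch))
```

Because header and stringsAsFactors are already named in the inner call, this sketch leaves only the remaining read.table options to the ... argument.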

Author(s)

David Kneis david.kneis@tu-dresden.de

See Also

After reading the set of files, one typically wants to check the data base for integrity using the functions check.notnull, check.unique, check.key, and check.link. It is probably good style to wrap all necessary checks into a single dedicated function that can be called repeatedly (e.g. after manipulation of data) or which can be re-used for other data bases of the same layout. See example below.
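To illustrate what key and link checks amount to, here is a minimal base R sketch. The helpers key_ok and link_ok and the toy tables are hypothetical; they are not the package's check.key or check.link and may differ from them in detail.

```r
# Toy tables standing in for a data base read from files (made-up data)
locations <- data.frame(id = c(1, 2), name = c("north", "south"),
                        stringsAsFactors = FALSE)
samples <- data.frame(date = c("2020-01-01", "2020-01-01", "2020-01-02"),
                      id_location = c(1, 2, 1),
                      stringsAsFactors = FALSE)

# Key check (hypothetical helper): the column combination must be
# free of missing values and must not contain duplicated rows
key_ok <- function(tbl, cols) {
  key <- tbl[, cols, drop = FALSE]
  !any(is.na(key)) && anyDuplicated(key) == 0
}

# Link check (hypothetical helper): every value in the child column
# must occur in the referenced parent column
link_ok <- function(child, child_col, parent, parent_col) {
  all(child[[child_col]] %in% parent[[parent_col]])
}

stopifnot(key_ok(samples, c("date", "id_location")))
stopifnot(link_ok(samples, "id_location", locations, "id"))

# A sample pointing to a non-existent location breaks the link
broken <- rbind(samples,
                data.frame(date = "2020-01-03", id_location = 99,
                           stringsAsFactors = FALSE))
print(link_ok(broken, "id_location", locations, "id"))
```

Each helper returns a single logical value, so the checks can be combined with stopifnot inside one validation function.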

Examples


# Read example DB shipped with the package
db <- db.read(dir=system.file("examples", package="tabular"), ext="tsv")
print(names(db))

# Integrity checks wrapped into a dedicated function
validate <- function(db) {
  with(db, {
    stopifnot(check.key(samples, c("date","id_location")))
    stopifnot(check.link(samples, "id_location", locations, "id"))
    # further checks would go here ...
  })
}

validate(db)

dkneis/tabular documentation built on Feb. 9, 2023, 12:34 a.m.