readDbR: Read database table using R


View source: R/readDbR.R

Description

Read a database which follows the Darwin Core Standard [1].

Usage

readDbR(data = NULL, path.data = NULL, cut.col = c(1, 78, 79, 200, 218,
  219), delt.undeterm = TRUE, save.name = NULL, wrt.frmt = "saveRDS",
  save.in = NULL)

Arguments

data

Vector of characters. Name of the input file.

path.data

Vector of characters. Path to the input file.

cut.col

Numeric vector. Numbers of the columns to read from the database. By default, columns c(1, 78, 79, 200, 218, 219) are read. These correspond to the following headers of the Darwin Core standard [1]: gbifID, decimalLongitude, decimalLatitude, elevation, speciesKey and species. See Details.

delt.undeterm

Logical vector. If TRUE, the function returns a data table containing only occurrences with a taxonomic determination down to species level. Otherwise, it returns all occurrences read from the database.

save.name

Vector of characters. Name of the output file.

wrt.frmt

Vector of characters. Format in which to save the output file. By default it is written as an R object using the value 'saveRDS', but it can be saved as plain text using the value 'saveTXT'. See Details.

save.in

Vector of characters. Path to the output file.
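
The effect described for delt.undeterm can be illustrated with a minimal base R sketch. This is not the package's code: the helper drop_undetermined is hypothetical, and the sketch assumes that an occurrence counts as undetermined when its species field is empty or missing.

```r
# Toy occurrence table. Assumption of this sketch: an empty or missing
# species value means the record was not determined to species level.
occ <- data.frame(
  gbifID = c(1, 2, 3, 4),
  species = c("Panthera onca", "", NA, "Tapirus terrestris"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of geocleaMT): keep only records whose
# species field is neither NA nor blank.
drop_undetermined <- function(x, species.col = "species") {
  sp <- x[[species.col]]
  x[!is.na(sp) & nzchar(trimws(sp)), , drop = FALSE]
}

determined <- drop_undetermined(occ)
```

Here determined keeps the first and last rows of occ, which is the kind of result delt.undeterm = TRUE is described as returning.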

Details

We recommend using this function when the database has fewer than one hundred thousand occurrences. The function runs in R and can be used on any operating system (Linux, Mac OS or Windows). If the database has more than one hundred thousand occurrences, we recommend using the readDbBash function instead. readDbBash relies on the cut command from Bash and works on Linux or Mac OS, but it is always faster than readDbR (up to four times faster).

Databases downloaded from the Global Biodiversity Information Facility (GBIF) [2] are exported with Darwin Core headers, and the column separator is TAB.

See readAndWrite function.

For the cut.col parameter, the column numbers to extract must be sorted in ascending order. For databases downloaded from GBIF [2], the number of each header can be seen in the ID column by running the data('ID_DarwinCore') command in the console.
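
As a rough illustration of selecting columns by number from a tab-separated file, the following base R sketch reads only the requested columns. It is not the implementation used by readDbR: the helper read_columns and the four-column toy file are assumptions made for this example.

```r
# Hypothetical helper (not part of geocleaMT): read only the columns
# listed in cut.col from a tab-separated file with a header row.
read_columns <- function(file, cut.col) {
  # Column numbers are expected in ascending order, without repeats.
  stopifnot(!is.unsorted(cut.col, strictly = TRUE))
  # Read just the header line to learn how many columns the file has.
  header <- strsplit(readLines(file, n = 1), "\t", fixed = TRUE)[[1]]
  # "NULL" makes read.delim skip a column; NA lets it guess the type.
  classes <- rep("NULL", length(header))
  classes[cut.col] <- NA
  read.delim(file, colClasses = classes, stringsAsFactors = FALSE)
}

# Toy file with four columns; a real GBIF export has many more.
toy <- tempfile(fileext = ".txt")
writeLines(c("gbifID\tbasisOfRecord\tdecimalLongitude\tspecies",
             "1\tPRESERVED_SPECIMEN\t-73.1\tPanthera onca",
             "2\tHUMAN_OBSERVATION\t-74.2\tTapirus terrestris"), toy)

selected <- read_columns(toy, cut.col = c(1, 3, 4))
```

In this sketch, skipping columns at read time avoids loading fields that are never used, which matters for wide files such as Darwin Core exports.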

For more details about the formats for reading and writing, see the readAndWrite function.
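
The two output formats named by wrt.frmt can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper save_table, the file extensions and the tab separator for the text output are hypothetical and may differ from what readAndWrite actually does.

```r
# Hypothetical helper (not part of geocleaMT): write a table either as an
# R object or as tab-separated text, mirroring the two wrt.frmt values.
save_table <- function(x, save.name, save.in, wrt.frmt = "saveRDS") {
  if (wrt.frmt == "saveRDS") {
    out <- file.path(save.in, paste0(save.name, ".rds"))
    saveRDS(x, out)
  } else if (wrt.frmt == "saveTXT") {
    out <- file.path(save.in, paste0(save.name, ".txt"))
    write.table(x, out, sep = "\t", quote = FALSE, row.names = FALSE)
  } else {
    stop("wrt.frmt must be 'saveRDS' or 'saveTXT'")
  }
  out  # return the path that was written
}

occ <- data.frame(gbifID = c(1L, 2L),
                  species = c("Panthera onca", "Tapirus terrestris"),
                  stringsAsFactors = FALSE)

out.dir <- tempdir()
rds.file <- save_table(occ, "occ", out.dir, "saveRDS")
txt.file <- save_table(occ, "occ", out.dir, "saveTXT")

# Read both files back to check that the table survives the round trip.
from.rds <- readRDS(rds.file)
from.txt <- read.delim(txt.file, stringsAsFactors = FALSE)
```

An RDS file preserves column types exactly and is convenient for further work in R, while plain text can be opened by other programs.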

Value

Writes a data table of class data.frame and a table with descriptive statistics.

Note

See: R-Alarcon V. and Miranda-Esquivel DR.(submitted) geocleaMT: An R package to cleaning geographical data from electronic biodatabases.

Author(s)

R-Alarcon Viviana and Miranda-Esquivel Daniel R.

References

[1] Wieczorek, J. et al. 2012. Darwin core: An evolving community-developed biodiversity data standard. PloS One 7: e29715.

[2] Global Biodiversity Information Facility. Available online at http://www.gbif.org/.

See Also

readDbR

readAndWrite


Dmirandae/geocleaMT-1 documentation built on Nov. 18, 2019, 6:26 p.m.