observation: Get 'observation' table
In samuel-rosa/febr: Data Repository of the Brazilian Soil

Description

Download data from the 'observation' ("observacao") table of one or more datasets published in the Data Repository of the Brazilian Soil. This table includes data such as latitude, longitude, date of observation, underlying geology, land use and vegetation, local topography, soil classification, and much more.

Usage

observation(
  data.set,
  variable,
  stack = FALSE,
  missing = list(coord = "keep", time = "keep", data = "keep"),
  standardization = list(crs = NULL, time.format = NULL, units = FALSE, round = FALSE),
  harmonization = list(harmonize = FALSE, level = 2),
  progress = TRUE,
  verbose = TRUE,
  febr.repo = NULL
)

Arguments

`data.set`	Character vector indicating the identification code of one or more data sets. Use `data.set = "all"` to download all data sets.
`variable`	(optional) Character vector indicating one or more variables. Accepts only general identification codes, e.g. `"ferro"` and `"carbono"`. If missing, then a set of standard identification variables is downloaded. Use `variable = "all"` to download all variables. See ‘Details’ for more information.
`stack`	(optional) Logical value indicating if tables from different datasets should be stacked into a single table for output. Requires `standardization = list(units = TRUE)` (see below). Defaults to `stack = FALSE`, the output being a list of tables.
`missing`	(optional) List with named sub-arguments indicating what should be done with observations that lack spatial coordinates (`coord`), date of observation (`time`), or data on variables (`data`). Options for each sub-argument are `"keep"` (default) and `"drop"`.
`standardization`	(optional) List with named sub-arguments indicating how to perform data standardization:
	`crs`	Character string indicating the EPSG code of the coordinate reference system (CRS) to which spatial coordinates should be transformed. For example, `crs = "EPSG:4674"`, i.e. SIRGAS 2000, the standard CRS for Brazil. Defaults to `crs = NULL`, i.e. no transformation is performed.
	`time.format`	Character string indicating how to format dates. For example, `time.format = "%d-%m-%Y"`, i.e. dd-mm-yyyy such as in 31-12-2001. Defaults to `time.format = NULL`, i.e. no formatting is performed. See `base::as.Date()` for more details.
	`units`	Logical value indicating if the measurement unit(s) of the continuous variable(s) should be converted to the standard measurement unit(s). Defaults to `units = FALSE`, i.e. no conversion is performed. See `dictionary()` for more information.
	`round`	Logical value indicating if the values of the continuous variable(s) should be rounded to the standard number of decimal places. Requires `units = TRUE`. Defaults to `round = FALSE`, i.e. no rounding is performed. See `dictionary()` for more information.
`harmonization`	(optional) List with named sub-arguments indicating if and how to perform data harmonization:
	`harmonize`	Logical value indicating if data should be harmonized. Defaults to `harmonize = FALSE`, i.e. no harmonization is performed.
	`level`	Integer value indicating the number of levels of the identification code of the variable(s) that should be considered for harmonization. Defaults to `level = 2`. See ‘Details’ for more information.
`progress`	(optional) Logical value indicating if a download progress bar should be displayed.
`verbose`	(optional) Logical value indicating if informative messages should be displayed. Generally useful to identify datasets with inconsistent data. Please report to febr-forum@googlegroups.com if you find any issue.
`febr.repo`	(optional) Defaults to the remote file directory of the Federal University of Technology - Paraná at https://cloud.utfpr.edu.br/index.php/s/Df6dhfzYJ1DDeso. Alternatively, a local directory path can be provided if the user has a local copy of the data repository.
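
The interplay between `stack`, `missing`, and `standardization` can be sketched as follows. This is only an illustration, not an official example: the helper objects `std` and `miss` are names chosen here, the dataset codes and the `"taxon"` variable are borrowed from the ‘Examples’ section, and the download is guarded so that it runs only in an interactive session with the febr package installed.

```r
# Sub-arguments spelled out in full so that nothing relies on partial defaults.
# `std` and `miss` are illustrative names, not part of the febr API.
std <- list(
  crs = "EPSG:4674",          # transform coordinates to SIRGAS 2000
  time.format = "%d-%m-%Y",   # dd-mm-yyyy
  units = TRUE,               # required by stack = TRUE and by round = TRUE
  round = TRUE
)
miss <- list(coord = "drop", time = "keep", data = "keep")

# What the chosen time.format string produces for a known date (base R only)
formatted <- format(as.Date("2001-12-31"), std$time.format)
print(formatted)

# The actual download needs network access, so it is guarded
if (interactive() && requireNamespace("febr", quietly = TRUE)) {
  res <- febr::observation(
    data.set = c("ctb0004", "ctb0005"),
    variable = "taxon",
    stack = TRUE,
    missing = miss,
    standardization = std
  )
}
```

Because `stack = TRUE` and `round = TRUE` both require `units = TRUE`, keeping the three settings together in one list makes the dependency easy to see.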

Details

Default variables

Default variables (fields) present in the 'observation' table are as follows:

dataset_id. Identification code of the dataset in the FEBR to which an observation belongs.
evento_id_febr. Identification code of an observation in a dataset.
evento_data. Date (dd-mm-yyyy) on which an observation was made.
coord_datum. EPSG code of the coordinate reference system.
coord_longitude. Longitude (deg) or easting (m).
coord_latitude. Latitude (deg) or northing (m).
coord_precisao. Precision with which the spatial coordinates were determined (m).
coord_fonte. Source of the spatial coordinates.
pais_id. Code (ISO 3166-1 alpha-2) of the country where an observation was made.
estado_sigla. Acronym of the Brazilian federative unit where an observation was made.
municipio_nome. Name of the Brazilian municipality where an observation was made.
subamostra_quanti. Number of subsamples taken (used to indicate composite sampling).
amostra_area. Sampling area (used to indicate areal or block sampling).

Further details about the content of the default variables (fields) can be found at https://docs.google.com/document/d/1Bqo8HtitZv11TXzTviVq2bI5dE6_t_fJt0HE-l3IMqM (in Portuguese).
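
As a small illustration of working with the default fields, the sketch below parses an `evento_data` column into R's `Date` class, assuming the dates arrive as dd-mm-yyyy character strings as described above. The data frame `toy` and its values are invented for this example and do not come from any FEBR dataset.

```r
# Invented stand-in for an 'observation' table; values are not real FEBR data
toy <- data.frame(
  evento_id_febr = c("perfil-01", "perfil-02"),
  evento_data = c("31-12-2001", NA),
  stringsAsFactors = FALSE
)

# Parse dd-mm-yyyy strings; missing dates stay NA
toy$evento_data <- as.Date(toy$evento_data, format = "%d-%m-%Y")
print(toy)
```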

Harmonization

Data harmonization consists of converting the values of a variable determined using some method B so that they are (approximately) equivalent to the values that would have been obtained if the standard method A had been used instead. For example, converting carbon content values obtained using a wet combustion method to the standard dry combustion method is data harmonization.

A heuristic data harmonization procedure is implemented in the febr package. It consists of grouping variables based on a chosen number of levels of their identification code. For example, consider a variable with an identification code composed of four levels, aaa_bbb_ccc_ddd, where aaa is the first level and ddd is the fourth level. Now consider a related variable, aaa_bbb_eee_fff. If the harmonization is to consider all four coding levels (level = 4), then these two variables will remain coded as separate variables. But if level = 2, then both variables will be re-coded as aaa_bbb, thus becoming the same variable.
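
The re-coding step described above can be sketched in base R. This is a minimal illustration of the idea, assuming the levels are separated by underscores as in the example codes; `harmonize_code()` is a hypothetical helper written for this page, not a function of the febr package, and it does not reproduce any other processing the package may perform.

```r
# Hypothetical helper (not part of febr): keep only the first `level`
# underscore-separated levels of each identification code.
harmonize_code <- function(code, level = 2) {
  parts <- strsplit(code, "_", fixed = TRUE)
  vapply(
    parts,
    function(p) paste(head(p, level), collapse = "_"),
    character(1)
  )
}

codes <- c("aaa_bbb_ccc_ddd", "aaa_bbb_eee_fff")

# level = 4 keeps all four levels, so the two variables stay separate
harmonize_code(codes, level = 4)

# level = 2 collapses both codes to "aaa_bbb", i.e. the same variable
harmonize_code(codes, level = 2)
```

A lower `level` therefore merges more variables, at the cost of treating values obtained with different methods as directly comparable.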

Value

A list of data.frames or a single data.frame with the (possibly standardized or harmonized) data of the chosen variable(s) of the chosen dataset(s).

Author(s)

Alessandro Samuel-Rosa alessandrosamuelrosa@gmail.com

Examples

if (interactive()) {
  # Download the standard identification variables of a single data set
  res <- observation(data.set = "ctb0013")

  # Download two data sets and standardize the CRS
  res <- observation(
    data.set = paste0("ctb000", 4:5),
    variable = "taxon",
    standardization = list(crs = "EPSG:4674")
  )

  # Try to download a data set that is not available yet
  res <- observation(data.set = "ctb0020")

  # Try to download a non-existing data set
  # res <- observation(data.set = "ctb0000")
}

samuel-rosa/febr documentation built on April 24, 2022, 11:46 a.m.