format_gq: Prepare data frame for flagging functions


format_gq    R Documentation

Prepare data frame for flagging functions

Description

format_gq renames certain fields of the data.frame so that the API can recognize them. Running this step before the flagging functions is highly recommended, so that the provided data.frame is assessed properly.

Usage

format_gq(indf, source = NULL, config = NULL, quiet = FALSE, ...)

## Default S3 method:
format_gq(indf, source = NULL, config = NULL,
  quiet = FALSE, ...)

## S3 method for class 'data.frame'
format_gq(indf, source = NULL, config = NULL,
  quiet = FALSE, ...)

Arguments

indf

Required. The data.frame on which to operate.

source

Optional. Indicates the package that was used to retrieve the data. Currently accepted values are "rvertnet", "rgbif" or "rinat". One of source, config or the individual parameters must be provided (see Details).

config

Optional. Configuration object mapping field names in the data.frame to the DarwinCore standard. Useful when repeatedly importing data from a source that is not available via the source argument. One of source, config or the individual parameters must be provided (see Details).

quiet

Optional. If TRUE, suppress all logging messages. Defaults to FALSE.

...

Optional. If neither source nor config is provided, the four key arguments (decimalLatitude, decimalLongitude, countryCode, scientificName) can be passed here. See examples.

Details

There are three ways to tell the function how to transform the data.frame: using the source parameter, providing a config object with the field mapping, or passing individual column names through the ... argument. The function checks these in that order: source overrides config, which overrides the individual mapping arguments.
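The precedence described above can be sketched in a few lines of base R. This is an illustration only, not the package's code: resolve_mapping is a hypothetical helper, and the single "rinat" mapping is an assumption based on the configuration example in the Examples section.

```r
# Hypothetical helper illustrating the documented precedence:
# source > config > individual arguments. Not part of rgeospatialquality.
resolve_mapping <- function(source = NULL, config = NULL, ...) {
  # Assumed mapping for one source, based on the 'rinat' example in Examples
  known_sources <- list(
    rinat = list(decimalLatitude = "latitude",
                 decimalLongitude = "longitude",
                 countryCode = NA,
                 scientificName = "scientific_name")
  )
  if (!is.null(source)) {
    if (!source %in% names(known_sources)) stop("unknown source: ", source)
    return(known_sources[[source]])
  }
  if (!is.null(config)) return(config)
  # Fall back to whatever individual column names were passed
  list(...)
}

conf <- list(decimalLatitude = "lat", decimalLongitude = "lng",
             countryCode = "ccode", scientificName = "sciname")

# 'source' wins even though 'config' is also supplied
m <- resolve_mapping(source = "rinat", config = conf)
m$decimalLatitude
```

Under these assumptions, passing both source and config yields the source mapping, which mirrors the order stated in this section.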

source refers to the package that was used to retrieve the data. Currently, three values are supported for this argument: "rgbif", "rvertnet" and "rinat". Support for more sources is planned.

config takes a configuration object holding the mapping of the field names. This option is a shortcut for users with custom-formatted data.frames who reuse the same mapping many times, so they don't have to retype it on every call. In practice, this object is a named list with the following four fields: decimalLatitude, decimalLongitude, countryCode and scientificName. Each element must be a string giving the name of the column in the data.frame that holds the values for that element. If the data.frame lacks one or more of these fields, set the corresponding element to NA; otherwise, the function will throw an error. See the examples section.
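As a minimal sketch of such a configuration object, the snippet below builds a named list for a data.frame that has no country code column and checks that all four fields are present. The column names ("lat", "lng", "sciname") and the check_config helper are hypothetical and are not part of the package.

```r
# Hypothetical validator, not part of rgeospatialquality
required_fields <- c("decimalLatitude", "decimalLongitude",
                     "countryCode", "scientificName")

check_config <- function(config) {
  missing_fields <- setdiff(required_fields, names(config))
  if (length(missing_fields) > 0) {
    stop("config is missing: ", paste(missing_fields, collapse = ", "))
  }
  invisible(TRUE)
}

# A data.frame with no country code column: mark that element as NA
conf <- list(decimalLatitude = "lat",
             decimalLongitude = "lng",
             countryCode = NA,
             scientificName = "sciname")
check_config(conf)
```

Keeping all four names in the list, with NA for absent columns, makes the object reusable across calls without having to remember which fields were left out.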

If neither source nor config is provided, the function expects the user to supply the mapping directly, passing the name of each column under the matching term of the DarwinCore standard. See the examples section.
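To give a rough picture of what a field mapping does to a data.frame, here is a base R sketch of the renaming step. It is an assumption about the general idea, not the package's implementation, which may differ; rename_by_mapping and the sample column names are hypothetical.

```r
# Hypothetical illustration of renaming columns to DarwinCore terms.
# Not the implementation used by format_gq.
rename_by_mapping <- function(df, mapping) {
  for (term in names(mapping)) {
    col <- mapping[[term]]
    if (is.na(col)) next  # field absent from the data.frame, nothing to rename
    names(df)[names(df) == col] <- term
  }
  df
}

df <- data.frame(lat = 40.4, lng = -3.7, sciname = "Apis mellifera",
                 stringsAsFactors = FALSE)

out <- rename_by_mapping(df, list(decimalLatitude = "lat",
                                  decimalLongitude = "lng",
                                  countryCode = NA,
                                  scientificName = "sciname"))
names(out)
```

The values in the data.frame are left untouched; only the column names change.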

Value

The provided data.frame, with field names changed to those the API expects.

See Also

add_flags

Examples

## Not run: 
# Using the rgbif package and the source argument
if (requireNamespace("rgbif", quietly=TRUE)) {
 d <- rgbif::occ_data(scientificName="Apis mellifera", limit=50, minimal=FALSE)
 d <- d$data
 d <- format_gq(d, source="rgbif")

 # Using a configuration object (matches 'rinat' schema)
 conf <- list(decimalLatitude="latitude",
              decimalLongitude="longitude",
              countryCode=NA,  # no such column; Details says to use NA
              scientificName="scientific_name")
 d <- format_gq(d, config=conf)

 # Passing individual parameters, all optional
 d <- format_gq(d,
                decimalLatitude="lat",
                decimalLongitude="lng",
                countryCode="ccode",
                scientificName="sciname")
}

## End(Not run)

ropenscilabs/rgeospatialquality documentation built on May 18, 2022, 7:42 p.m.