#' Species occurrence records on U.S. Fish and Wildlife properties
#'
#' Geographic query of biodiversity databases based on U.S. Fish and Wildlife
#' Service (USFWS) property boundaries
#'
#' **Important usage limitations**: This function exists strictly to extract
#' occurrence data for a given \code{fws} property. Attempts at estimating
#' or inferring relative abundance are most strongly discouraged and almost
#' certainly meaningless.
#'
#' Also note that the extraction of records occurs on a property-by-property
#' basis so the same record may occur in multiple polygons depending on
#' buffer specifications.
#'
#' @section General overview:
#' The basic process is to query GBIF, iDigBio, the Berkeley
#' 'Ecoinformatics' Engine, and AntWeb based on the bounding box
#' associated with each \code{fws} and any requested buffer. For VertNet,
#' the API requires spatial searches using a lat/lon coordinate and search
#' radius. We calculate the radius needed to fully capture the desired
#' geometry. Records are subsequently filtered based on the exact geometry.
#'
#' Properties that occupy a relatively small area compared to the corresponding
#' bounding box will be split into smaller pieces to avoid unnecessarily large
#' queries. Likewise, queries (typically GBIF) that contain many
#' records (> 125000) are split to improve efficiency (both) and recover all
#' records (GBIF).
#'
#' We provide the options to:
#' \itemize{
#' \item scrub records to reduce the number of returned records for each
#' \code{fws} (details below);
#' \item attempt to link each observation to some standardized taxonomic information
#' (details below)
#' }
#'
#' @section Scrubbing details:
#' By default (\code{scrub = "strict"}), \code{fws_occ} scrubs a lot of records.
#' Specifically, within a given property, it retains a single record for each
#' species, giving preference to records with associated media (e.g., photo, audio).
#' The retained record is typically the most recent with evidence to support the
#' identification. Records for which evidence was not available (i.e., no associated
#' collection or catalog number) are removed. Moderate scrubbing
#' (\code{scrub = "moderate"}) attempts only to remove duplicate records (i.e., identical
#' catalog numbers) and redundant observations (i.e., multiple individuals of the same
#' species recorded on the same date at a single location). All records can be
#' returned with (\code{scrub = "none"}).
#'
#' @section Taxonomy information details:
#' By default (\code{taxonomy = TRUE}), \code{fwspp} attempts to check the validity of
#' scientific names against the Integrated Taxonomic Information System (ITIS). It does
#' this not by connecting to ITIS directly, but by requesting information from a web
#' service maintained by the U.S. Fish and Wildlife Service as part of their FWSpecies database
#' Note that this means if taxonomy information
#' is requested, and an ITIS match found, the scientific name will be converted to the
#' "accepted" ITIS scientific name, and the corresponding ITIS Taxonomic Serial Number,
#' an NPS-specific taxon code, a common name used by USFWS, and a general organism
#' "category" (e.g., Mammals, Birds, Fungi) are returned.
#'
#' @section Additional boundary information:
#' The boundaries specified by the "admin" option to the \code{bnd} argument
#' delineates those lands and waters administered by the USFWS in North
#' America, U.S. Trust Territories and Possessions. It may also include
#' inholdings that are not administered by the USFWS. The primary source
#' for this information is the USFWS Realty program. See
#' \url{https://ecos.fws.gov/ServCat/Reference/Profile/60739} for more
#' information.
#'
#' The boundaries specified by the "acq" option to the \code{bnd} argument
#' delineates the external boundaries of lands and waters that are approved
#' for acquisition by the USFWS in North America, U.S. Trust Territories and
#' Possessions. The primary source for this information is the USFWS Realty
#' program. See \url{https://ecos.fws.gov/ServCat/Reference/Profile/60738} for
#' more information.
#'
#' @section Query timeout details:
#' By default, timeout is calculated based on testing of BISON queries on a ~ 20 Mbps
#' internet connection. BISON queries are nearly always the largest (by # records)
#' *contiguous* requests (GBIF is slower but requests occur in smaller chunks). If
#' timeouts are a recurring problem, however, it may be worth checking your download
#' speed (e.g. \url{http://www.speedtest.net}) and setting this parameter if your
#' speeds are considerably below ~ 20 Mbps to allow more time for HTTP requests to
#' process. For example, if your download speed is estimated at 5 Mbps, you might
#' consider setting this parameter to \code{timeout = 4} or so. If regular timeout
#' persist after adjusting this parameter, however, please contact the maintainer
#' with details or, better yet, file an issue at
#' \url{https://github.com/USFWS/fwspp/issues}.
#'
#' @param fws \code{data.frame} of organizational names (ORGNAME) and
#' type (RSL_TYPE) of USFWS properties and their associated USFWS region
#' (FWSREGION), within and around which the user wishes to extract species
#' occurrence records. It is strongly recommended to use the results generated
#' by \code{\link{find_fws}} to ensure proper content and formatting. See examples.
#' @param bnd character scalar indicating the type of property boundary to use.
#' Default ("admin") uses the current administrative boundary. The approved
#' acquisition boundary ("acq") is another option.
#' @param buffer numeric scalar; distance (km) from the \code{fws} \code{bnd} to
#' include in the search for species occurrence records. Default is no buffer.
#' @param scrub character; one of "strict" (default), "moderate", or "none",
#' indicating the extent to which to reduce the number of records returned for
#' a given \code{fws}. See details.
#' @param taxonomy logical (default TRUE); attempt to link occurrence records to
#' standardized taxon information? See details.
#' @param verbose logical (default TRUE); print messages during species occurrence
#' queries?
#' @param timeout numeric; if specified, serves as a multiplier for the timeout
#' value calculated internally (e.g., \code{timeout = 2} doubles the amount of
#' time to allow for HTTP requests to process. By default (\code{timeout = NULL}),
#' the query timeout is set programmatically and conservatively. See details.
#'
#' @importFrom pbapply pbapply
#' @importFrom purrr safely
#'
#' @export
#'
#' @return an \code{list} of class \code{fwspp} with observations for each property with
#' the following columns if taxonomic information is requested (\code{taxonomy = TRUE};
#' default). If \code{taxonomy = FALSE}, only a subset of these columns is returned.
#' \describe{
#' \item{org_name}{official organizational name of USFWS property}
#' \item{sci_name}{Scientific name associated with the observation.}
#' \item{lon}{Longitude (WGS84) of observation.}
#' \item{lat}{Latitude (WGS84) of observation.}
#' \item{loc_unc_m}{Locational uncertainty (m) of observation coordinates; typically
#' unavailable.}
#' \item{year}{Year of observation, if available.}
#' \item{month}{Month of observation, if available.}
#' \item{day}{Day of month of observation, if available.}
#' \item{evidence}{Attempt to link observation to a URL that best substantiates the
#' observation, if available, generally in the following order: (1) URL to
#' observation with media (photo, audio, video) or the media itself, (2) URL
#' to the observation in the original collection, (3) URL of the collection,
#' with catalog number, or (4) URL of the institution housing the collection,
#' with catalog number.}
#' \item{bio_repo}{Biodiversity database source of the observation, currently one of
#' GBIF, BISON, iDigBio, VertNet, EcoEngine, or AntWeb.}
#' \item{acc_sci_name}{Accepted/valid scientific name from ITIS, if available.}
#' \item{com_name}{Regularly used vernacular names, if available.}
#' \item{taxon_rank}{Taxonomic rank of observation.}
#' \item{category}{Generic taxonomic grouping for taxa used in FWSpecies database.}
#' \item{tsn}{Accepted/valid Taxonomic Serial Number from ITIS, if available.}
#' \item{note}{Additional notes on the observation, currently restricted to indicating
#' that a matching taxon was not found in ITIS or trouble singling out a taxon
#' from FWSpecies.}
#' }
#'
#' @examples
#' \dontrun{
#' # Single refuge, administrative boundary, no buffer
#' # By default, records are scrubbed very strictly (see Details)
#' # Mountain Longleaf National Wildlife Refuge
#' ml <- find_fws("longleaf")
#' ml_occ <- fws_occ(fws = ml)
#'
#' # Multiple refuges, acquisition boundary with 5 km buffer, moderate scrubbing,
#' # no taxonomy info added
#' multi <- find_fws(c("longleaf", "romain"))
#' multi_occ <- fws_occ(fws = multi, bnd = "acq", buffer = 5, scrub = "moderate",
#' taxonomy = FALSE)
#'
#' # All Region 4 (southeast) refuges, with defaults
#' r4 <- find_fws(region = 4)
#' r4_occ <- fw_spp(r4)
#' }
fws_occ <- function(fws = NULL, bnd = c("admin", "acq"),
buffer = 0, scrub = c("strict", "moderate", "none"),
taxonomy = TRUE, verbose = TRUE,
timeout = NULL,
start_day = NULL, start_month = NULL, start_year = NULL) {
if (is.null(fws) || !is.data.frame(fws))
stop("You must provide valid property names to query.",
"\nSee `?find_fws` for examples.")
if (is.null(start_day))
start_day <- 1
if (is.null(start_month))
start_month <- 1
if (is.null(start_year))
start_year <- 1776
start_date <- as.POSIXlt(paste0(start_year, "-", start_month, "-", start_day))
bnd <- match.arg(bnd)
scrub <- match.arg(scrub)
# Import spatial data for relevant properties
check_cadastral()
props <- prep_properties(fws, bnd, verbose)
start_time <- Sys.time()
# Cycle through properties
if (verbose)
out <- lapply(fws$ORGNAME, function(prop) {
retrieve_occ(props, prop, buffer, scrub, timeout, start_date)})
else
out <- pbapply::pblapply(fws$ORGNAME, function(prop) {
suppressMessages(
retrieve_occ(props, prop, buffer, scrub, timeout, start_date))})
attributes(out) <- list(names = fws$ORGNAME,
class = "fwspp",
boundary = bnd, scrubbing = scrub,
buffer_km = buffer,
query_dt = start_time)
# Taxomony linking, if requested
if (taxonomy) {
safe_tax <- purrr::safely(add_taxonomy)
out_tax <- safe_tax(out)
if (is_error(out_tax)) {
message("Taxonomy retrieval failed with the following error:\n ",
out_tax$error$message)
message(
wrap_message(paste("Skipping taxonomy. Please send the resulting `fwspp` object",
"to the maintainer of the `fwspp` package.\n",
"You may also try again later using `fwspp::add_taxonomy`.")))
}
else out <- out_tax$result
}
out
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.