R/atlas_occurrences.R


#' Retrieve a database query
#'
#' @description
#' An alternative to using \code{\link[=collect.data_request]{collect()}} at the 
#' end of a query pipe is to call a function with the `atlas_` prefix. These
#' solutions are largely equivalent, but `atlas_` functions differ in two ways:
#' 
#'   * They have the ability to accept `filter`, `select` etc. as arguments,
#'     rather than within a pipe; but **only** when using the `galah_` forms of 
#'     those functions (e.g. [galah_filter()]).
#'   * `atlas_` functions do not require you to specify the `method` or `type` 
#'     arguments to [galah_call()], as they are more specific in what data are 
#'     being requested.
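#'
#' As a hedged illustration of the first point, the sketch below passes
#' `galah_` helpers as arguments rather than building a pipe. The taxon and
#' year range are placeholders chosen for this example, and it assumes an
#' email address has been registered for downloads via [galah_config()]:
#'
#' ```r
#' library(galah)
#'
#' # placeholder address; downloads require a registered email
#' galah_config(email = "your_email_here")
#'
#' # argument form: no galah_call() or pipe is needed
#' atlas_occurrences(
#'   identify = galah_identify("Litoria"),
#'   filter = galah_filter(year >= 2010, year <= 2020)
#' )
#' ```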
#'
#' @name atlas_
#' @order 1
#' @param request optional `data_request` object: generated by a call to
#' [galah_call()].
#' @param identify `tibble`: generated by a call to [galah_identify()].
#' @param filter `tibble`: generated by a call to [galah_filter()].
#' @param geolocate `string`: generated by a call to [galah_geolocate()].
#' @param data_profile `string`: generated by a call to [galah_apply_profile()].
#' @param select `tibble`: generated by a call to [galah_select()].
#' @param mint_doi `logical`: by default no DOI will be generated. Set to
#' `TRUE` if you intend to use the data in a publication or similar.
#' @param doi `string`: (Optional) DOI to download. If provided overrides
#' all other arguments. Only available for the ALA.
#' @param file `string`: (Optional) file name. If not given, will be set to 
#' `data` with date and time added. The file path (directory) is always given by 
#' `galah_config()$package$directory`. 
#' @details
#' Note that unless care is taken, some queries can be particularly large.
#' While in most cases this will simply take a long time to process, if the number
#' of requested records is >50 million, the call will not return any data. Users
#' can test whether this threshold will be reached by first calling
#' [atlas_counts()] using the same arguments that they intend to pass to
#' `atlas_occurrences()`. It may also be beneficial when requesting a large
#' number of records to show a progress bar by setting `verbose = TRUE` in
#' [galah_config()], or to use `compute()` to run the call before collecting
#' it later with `collect()`.
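#'
#' A minimal sketch of that workflow follows. It assumes that
#' `atlas_counts()` returns a one-row tibble with a `count` column when no
#' grouping is applied, and that `compute()` accepts a `data_request` built
#' by [galah_call()]; the taxon, years, and email are placeholders:
#'
#' ```r
#' library(galah)
#'
#' # placeholder address; verbose = TRUE shows a progress bar
#' galah_config(email = "your_email_here", verbose = TRUE)
#'
#' # build the query once so the count and the download share arguments
#' query <- galah_call() |>
#'   identify("Reptilia") |>
#'   filter(year >= 2010, year <= 2020)
#'
#' # check the size of the request before downloading
#' n_records <- (query |> atlas_counts())$count
#'
#' # only download if the request is below the 50 million record limit
#' if (n_records < 5e7) {
#'   occurrences <- query |> atlas_occurrences()
#' }
#'
#' # alternatively, submit the request now and fetch the result later
#' pending <- query |> compute()
#' occurrences <- collect(pending)
#' ```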
#' @return An object of class `tbl_df` and `data.frame` (aka a tibble). For
#' `atlas_occurrences()` and `atlas_species()`, this will have columns specified 
#' by \code{\link[=select.data_request]{select()}}. For `atlas_counts()`, 
#' it will have columns specified by 
#' \code{\link[=group_by.data_request]{group_by()}}.
#' @examples \dontrun{
#' # Best practice is to first calculate the number of records
#' galah_call() |>
#'   filter(year == 2015) |>
#'   atlas_counts()
#' 
#' # Download occurrence records for a specific taxon
#' galah_config(email = "your_email_here") # login required for downloads
#' galah_call() |>
#'   identify("Reptilia") |>
#'   atlas_occurrences()
#'
#' # Download occurrence records in a year range
#' galah_call() |>
#'   identify("Litoria") |>
#'   filter(year >= 2010 & year <= 2020) |>
#'   atlas_occurrences()
#'   
#' # Download occurrence records in a WKT-specified area
#' polygon <- "POLYGON((146.24960 -34.05930,
#'                      146.37045 -34.05930,
#'                      146.37045 -34.15254,
#'                      146.24960 -34.15254,
#'                      146.24960 -34.05930))"
#' galah_call() |> 
#'   identify("Reptilia") |>
#'   filter(year >= 2010, year <= 2020) |>
#'   st_crop(polygon) |>
#'   atlas_occurrences()
#'   
#' # Get a list of species within genus "Heleioporus"
#' # (every row is a species with associated taxonomic data)
#' galah_call() |>
#'   identify("Heleioporus") |>
#'   atlas_species()
#'
#' # Download Regent Honeyeater records with multimedia attached
#' # Note this returns one row per multimedia file, NOT one per occurrence
#' galah_call() |>
#'   identify("Regent Honeyeater") |>
#'   filter(year == 2011) |>
#'   atlas_media()
#' 
#' # Get a taxonomic tree of *Chordata* down to the class level
#' galah_call() |> 
#'   identify("chordata") |>
#'   filter(rank == class) |>
#'   atlas_taxonomy()
#' }
#' @export
atlas_occurrences <- function(request = NULL,
                              identify = NULL,
                              filter = NULL,
                              geolocate = NULL,
                              data_profile = NULL,
                              select = NULL,
                              mint_doi = FALSE,
                              doi = NULL,
                              file = NULL
                              ) {
  if(!is.null(doi)){
    # a supplied DOI overrides all other arguments
    request_data() |>
      filter(doi == doi) |>
      collect(file = file)
  }else{
    args <- as.list(environment()) # capture supplied arguments
    check_atlas_inputs(args) |> # convert to `data_request` object
      collect(wait = TRUE,
              file = file)
  }
}
AtlasOfLivingAustralia/galah documentation built on Feb. 8, 2025, 9:25 a.m.