galah_filter: Narrow a query by specifying filters

galah_filter R Documentation

Narrow a query by specifying filters

Description

"Filters" are arguments of the form field - logical - value that narrow the set of records returned by a query. For example, it is common for users to request records from a particular year (year == 2020), or to return all records except for fossils (basisOfRecord != "FossilSpecimen").

The result of galah_filter() can be passed to the filter argument in atlas_occurrences(), atlas_species(), atlas_counts() or atlas_media().

Usage

galah_filter(..., profile = NULL)

## S3 method for class 'data_request'
filter(.data, ...)

## S3 method for class 'metadata_request'
filter(.data, ...)

## S3 method for class 'files_request'
filter(.data, ...)

Arguments

...

Filters, in the form field - logical - value

profile

[Deprecated] Use galah_apply_profile instead.

.data

An object of class data_request, metadata_request or files_request, created using request_data(), request_metadata() or request_files() respectively

Details

galah_filter uses non-standard evaluation (NSE), and is designed to be as compatible as possible with dplyr::filter() syntax.

All statements passed to galah_filter() (except the profile argument) take the form of field - logical - value. Permissible examples include:

  • = or == (e.g. year = 2020)

  • != (e.g. year != 2020)

  • > or >= (e.g. year >= 2020)

  • < or <= (e.g. year <= 2020)

  • OR statements (e.g. year == 2018 | year == 2020)

  • AND statements (e.g. year >= 2000 & year <= 2020)

In some cases R will fail to parse inputs with a single equals sign (=), particularly where statements are separated by & or |. This problem can be avoided by using a double-equals (==) instead.
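The parsing problem described above can be reproduced without sending a query. The following is a minimal sketch in base R: it only asks the R parser whether each call is syntactically valid and never evaluates the call, so galah does not need to be loaded. The helper name parses() is hypothetical and is not part of galah.

```r
# Base R only: check whether each call parses, without evaluating it.
# parses() is a hypothetical helper written for this illustration.
parses <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# Two single-equals statements joined by `|`: the second `=` is a syntax error
single_eq <- parses("galah_filter(year = 2018 | year = 2020)")

# The same statements written with `==` parse cleanly
double_eq <- parses("galah_filter(year == 2018 | year == 2020)")

# A lone single-equals statement parses, because R reads it as a named argument
lone_eq <- parses("galah_filter(year = 2020)")

print(c(single_eq = single_eq, double_eq = double_eq, lone_eq = lone_eq))
```

Because the failure happens in the R parser, before galah_filter() is ever called, the function cannot recover from it; writing == throughout avoids the issue.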

Notes on behaviour

Separating statements with a comma is equivalent to an AND statement; so galah_filter(year >= 2010 & year < 2020) is the same as galah_filter(year >= 2010, year < 2020).

All statements must include the field name; so galah_filter(year == 2010 | year == 2021) works, as does galah_filter(year == c(2010, 2021)), but galah_filter(year == 2010 | 2021) fails.

It is possible to use an object to specify required values, e.g. year_value <- 2010; galah_filter(year > year_value).

Solr supports range queries on text as well as numbers, so this is valid: galah_filter(cl22 >= "Tasmania").
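The notes above can be collected into one short sketch. It assumes the galah package is installed, and that building a filter may need a reachable atlas, depending on configuration. The object names are illustrative only and no query is sent for records.

```r
library(galah)

# Comma-separated statements are combined with AND, so these two
# calls describe the same filter.
decade_and   <- galah_filter(year >= 2010 & year < 2020)
decade_comma <- galah_filter(year >= 2010, year < 2020)

# Every statement names its field; a vector supplies several values.
two_years_or  <- galah_filter(year == 2010 | year == 2021)
two_years_vec <- galah_filter(year == c(2010, 2021))

# Values can come from an object in the calling environment.
year_value <- 2010  # illustrative threshold
after_2010 <- galah_filter(year > year_value)

# Inspect one of the resulting filter objects
print(decade_comma)
```

Each object can then be supplied wherever a filter is accepted, as described in the Description section.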

Value

A tibble containing filter values.

See Also

search_taxa() and galah_geolocate() for other ways to restrict the information returned by atlas_occurrences() and related functions. Use search_all(fields) to find fields that you can filter by, and show_values() to find what values of those filters are available.

Examples

## Not run: 
# Filter query results to return records of interest
galah_call() |>
  galah_filter(year >= 2019,
               basisOfRecord == "HumanObservation") |>
  atlas_counts()

# Alternatively, the same call using `dplyr` functions:
request_data() |>
  filter(year >= 2019,
         basisOfRecord == "HumanObservation") |>
  count() |>
  collect()

## End(Not run)

galah documentation built on Nov. 20, 2023, 9:07 a.m.