search_data: Search published data

View source: R/search_data.R

search_data    R Documentation

Search published data

Description

Search published data

Usage

search_data(text, taxa, num_taxa, num_years, sd_years, area, boolean = "AND")

Arguments

text

(character) Text to search for in dataset titles, descriptions, and abstracts. Datasets matching any exact words or phrase will be returned. Can be a regular expression as used by stringr::str_detect(). Matching is not case sensitive. Multiple values are combined using the operator given in boolean.

taxa

(character) Taxonomic rank values to search on. For EDI datasets, the full taxonomic hierarchy of each taxon (including common names) is searchable. This is not yet the case for NEON datasets, for which only the lowest-level rank value is searchable.

num_taxa

(numeric) Minimum and maximum number of taxa the dataset should contain. Any datasets within this range will be returned.

num_years

(numeric) Minimum and maximum number of years sampled the dataset should contain. Any datasets within this range will be returned.

sd_years

(numeric) Minimum and maximum standard deviation between survey dates (in years). Any datasets within this range will be returned.
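
As an illustration of what sd_years measures, the sketch below computes the standard deviation of the intervals between survey dates, expressed in years, and checks it against a range. This is one plausible reading of the definition above, not the package's actual implementation; the dates and the variable names (survey_dates, intervals_years, sd_range) are made up for the example, and the 365.25 days-per-year conversion is an assumption.

```r
# Made-up survey dates for a single hypothetical dataset
survey_dates <- as.Date(c("2000-01-01", "2001-01-01", "2002-01-01", "2004-01-01"))

# Intervals between consecutive surveys, converted from days to years
# (365.25 days per year is an assumption of this sketch)
intervals_days <- as.numeric(diff(sort(survey_dates)), units = "days")
intervals_years <- intervals_days / 365.25

# Standard deviation of the sampling intervals, in years
sd_interval <- sd(intervals_years)

# Check against a c(min, max) range like the one passed to sd_years
sd_range <- c(0.01, 100)
in_range <- sd_interval >= sd_range[1] && sd_interval <= sd_range[2]
```

A perfectly regular sampling schedule gives a standard deviation near zero, so a small minimum such as 0.01 excludes only datasets with almost no variation in sampling interval.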

area

(numeric) Bounding coordinates within which the data should originate. Accepted values are in decimal degrees and in the order: North, East, South, West. Any datasets with overlapping areas or contained points will be returned.
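
Because the coordinate order (North, East, South, West) is easy to mix up, the following sketch names the elements of an area vector and tests whether two boxes overlap. Both as_bbox() and bbox_overlaps() are hypothetical helpers written for this example; they are not part of ecocomDP, and the overlap logic is only an assumption about how "overlapping areas or contained points" could be evaluated. Boxes crossing the antimeridian are not handled.

```r
# Hypothetical helper: label a numeric vector in the documented order
as_bbox <- function(area) {
  stopifnot(is.numeric(area), length(area) == 4)
  names(area) <- c("north", "east", "south", "west")
  # North must not be south of South
  stopifnot(area[["north"]] >= area[["south"]])
  area
}

# Hypothetical overlap test for two boxes given as North, East, South, West.
# A single point can be passed as a box with north == south and east == west.
bbox_overlaps <- function(a, b) {
  a <- as_bbox(a)
  b <- as_bbox(b)
  a[["south"]] <= b[["north"]] && b[["south"]] <= a[["north"]] &&
    a[["west"]] <= b[["east"]] && b[["west"]] <= a[["east"]]
}

query   <- c(47.1, -86.7, 42.5, -92)  # same box as in the Examples section
inside  <- c(45, -89, 44, -90)        # made-up box within the query box
outside <- c(35, -80, 30, -85)        # made-up box south-east of the query box

bbox_overlaps(query, inside)   # TRUE
bbox_overlaps(query, outside)  # FALSE
```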

boolean

(character) Boolean operator to use when searching text and taxa. Supported operators are: "AND", "OR". Default is "AND".
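
The difference between "AND" and "OR" can be sketched on a few made-up titles. The example uses base R's grepl() with ignore.case = TRUE as a stand-in for the case-insensitive stringr::str_detect() matching mentioned under text. How search_data() combines matches internally is not shown here, so treat this as an assumption about the semantics rather than the package's code.

```r
# Made-up dataset titles, for illustration only
titles <- c("Lake fish survey", "River invertebrates", "Lake and river plankton")
terms <- c("lake", "river")

# One column of logical matches per search term (case-insensitive)
hits <- sapply(terms, function(p) grepl(p, titles, ignore.case = TRUE))

# "OR": keep titles matching at least one term
match_or <- apply(hits, 1, any)

# "AND": keep titles matching every term
match_and <- apply(hits, 1, all)

titles[match_or]   # all three titles
titles[match_and]  # "Lake and river plankton"
```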

Details

Currently, to accommodate multiple L1 versions of NEON data products, search results for a NEON L0 will also list all the L1 versions available for the match. This method assumes that the summary data are the same among L1 versions, an assumption that may need to be revisited in the future. L0 identifiers and their corresponding L1 identifiers are listed in /inst/L1_versions.txt. Each L1 version is accompanied by qualifying text, appended to the title, abstract, and description, that explains how the L1 versions differ.

Value

(tbl_df, tbl, data.frame) Search results with these fields:

  • source - Source from which the dataset originates. Currently supported are "EDI" and "NEON".

  • id - Identifier of the dataset.

  • title - Title of the dataset.

  • description - Description of dataset. Only returned for NEON datasets.

  • abstract - Abstract of dataset.

  • years - Number of years sampled.

  • sampling_interval - Standard deviation between sampling events in years.

  • sites - Site names or abbreviations. Only returned for NEON datasets.

  • url - URL to dataset.

  • source_id - Identifier of source L0 dataset.

  • source_id_url - URL to source L0 dataset.

Note

This function may not work between 01:00 - 03:00 UTC on Wednesdays due to regular maintenance of the EDI Data Repository.

Examples

## Not run: 
# Empty search returns all available datasets
search_data()

# "text" searches titles, descriptions, and abstracts
search_data(text = "Lake")

# "taxa" searches taxonomic ranks for a match
search_data(taxa = "Plantae")

# "num_years" searches the number of years sampled
search_data(num_years = c(10, 20))

# Use any combination of search fields to find the data you're looking for
search_data(
  text = c("Lake", "River"),
  taxa = c("Plantae", "Animalia"),
  num_taxa = c(0, 10),
  num_years = c(10, 100),
  sd_years = c(0.01, 100),
  area = c(47.1, -86.7, 42.5, -92),
  boolean = "OR")

## End(Not run)


ecocomDP documentation built on July 9, 2023, 6:42 p.m.