The ridigbio package can be used to obtain records from the iDigBio APIs, including both the Search API and the Media API.
In this demo we will cover how to:
1. Install ridigbio
2. Search for records with idig_search_records()
3. Search for media with idig_search_media()
First, you must install the ridigbio package. If you are new to R and RStudio, please refer to our QUBES module to get started: Introduction to R with Biodiversity Data, doi:10.25334/84FC-TE88.
The latest version of our R package can be installed from CRAN.
install.packages("ridigbio")
Before downloading any records, you must load the ridigbio package.
library(ridigbio)
verify_galax_records <- FALSE
# Test that the examples will run
tryCatch({
  # Code that might throw an error
  verify_galax_records <- idig_search_records(rq = list(scientificname = "Galax urceolata"), limit = 10)
}, error = function(e) {
  # Code to run if an error occurs
  cat("An error occurred during the idig_search_records call: ", e$message, "\n")
  cat("Vignettes will not be fully generated. Please try again after resolving the issue.")
  # Optionally, you can return NULL or an empty dataframe
  verify_galax_records <- FALSE
})
To download records from the Search API, we will use the function idig_search_records(). Here the rq, or record query, indicates we want to download all the records where the scientificname is equal to Galax urceolata.
galax_records <- idig_search_records(rq=list(scientificname="Galax urceolata"))
colnames(galax_records)
When fields are not specified, default columns include the following:
| Column | Description |
|--------|-------------|
| uuid | Universally Unique IDentifier assigned by iDigBio |
| occurrenceid | identifier for the occurrence, https://rs.tdwg.org/dwc/terms/occurrenceID |
| catalognumber | identifier for the record within the collection, https://rs.tdwg.org/dwc/terms/catalogNumber |
| family | scientific name of the family, https://rs.tdwg.org/dwc/terms/family |
| genus | scientific name of the genus, https://rs.tdwg.org/dwc/terms/genus |
| scientificname | scientific name, https://rs.tdwg.org/dwc/terms/scientificName |
| country | country, https://rs.tdwg.org/dwc/terms/country |
| stateprovince | name of the next smaller administrative region than country, https://rs.tdwg.org/dwc/terms/stateProvince |
| geopoint.lon | equivalent to decimalLongitude, https://rs.tdwg.org/dwc/terms/decimalLongitude |
| geopoint.lat | equivalent to decimalLatitude, https://rs.tdwg.org/dwc/terms/decimalLatitude |
| datecollected | modified field that could lack biological meaning |
| data.dwc:eventDate | equivalent to eventDate, https://dwc.tdwg.org/list/#dwc_eventDate |
| data.dwc:year | year of collection event, https://dwc.tdwg.org/list/#dwc_year |
| data.dwc:month | month of collection event, https://dwc.tdwg.org/list/#dwc_month |
| data.dwc:day | day of collection event |
| collector | equivalent to recordedBy, https://rs.tdwg.org/dwc/terms/recordedBy |
| recordset | indicates the iDigBio recordset the observation belongs to |
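The default columns can be narrowed by passing a character vector to the fields argument of idig_search_records(). The sketch below only assembles the arguments. The live call is guarded by a hypothetical run_live flag (not part of ridigbio) so that the block runs without network access. The choice of fields is illustrative.

```r
# Field names taken from the table of default columns above.
wanted_fields <- c("uuid", "family", "scientificname", "country", "stateprovince")

# Arguments for the search, kept in a list so they can be inspected first.
search_args <- list(
  rq = list(scientificname = "Galax urceolata"),
  fields = wanted_fields,
  limit = 10
)

# Hypothetical switch (an assumption of this sketch): set to TRUE to query iDigBio.
run_live <- FALSE

if (run_live && requireNamespace("ridigbio", quietly = TRUE)) {
  galax_subset <- do.call(ridigbio::idig_search_records, search_args)
  print(colnames(galax_subset))
}
```

Requesting only the columns you need keeps the returned data frame small and easier to inspect.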
In addition to scientificname, the record query may be based on many other fields. For example, you can search for all members of the family Diapensiaceae:
diapensiaceae_records <- idig_search_records(rq=list(family="Diapensiaceae"), limit=1000)
What if you want to read in all the points for a family within an extent?
Hint: Use the iDigBio portal to determine the bounding box for your region of interest.
The bounding box delimits the geographic extent.
rq_input <- list(
  "scientificname" = list("type" = "exists"),
  "family" = "Diapensiaceae",
  geopoint = list(
    type = "geo_bounding_box",
    top_left = list(lon = -98.16, lat = 48.92),
    bottom_right = list(lon = -64.02, lat = 23.06)
  )
)
Search using the query you just built:
diapensiaceae_records_USA <- idig_search_records(rq_input, limit=1000)
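If you build bounding-box queries for several regions, a small helper can reduce copy-and-paste errors. The function below, make_bbox_query(), is a hypothetical helper and not part of ridigbio. It is a minimal sketch that assumes the box does not cross the antimeridian, and it only builds the query list, so it runs without network access.

```r
# Hypothetical helper (not part of ridigbio): build a record query that
# restricts a family to a geographic bounding box.
make_bbox_query <- function(family, west, north, east, south) {
  stopifnot(
    west >= -180, east <= 180, west < east,   # longitudes run west to east
    south >= -90, north <= 90, south < north  # latitudes run south to north
  )
  list(
    scientificname = list(type = "exists"),
    family = family,
    geopoint = list(
      type = "geo_bounding_box",
      top_left = list(lon = west, lat = north),
      bottom_right = list(lon = east, lat = south)
    )
  )
}

# Same extent as the query above.
rq_usa <- make_bbox_query(
  "Diapensiaceae",
  west = -98.16, north = 48.92, east = -64.02, south = 23.06
)
str(rq_usa)
```

The resulting list can be passed as the rq argument of idig_search_records() in the same way as rq_input.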
To download media records from the Media API, we will use the function idig_search_media(). Here the rq, or record query, indicates we want to download all the media records associated with records where the scientificname is equal to Galax urceolata.
galax_media <- idig_search_media(rq=list(scientificname="Galax urceolata"))
colnames(galax_media)
When fields are not specified, default columns include the following:
| Column | Description |
|--------|-------------|
| accessuri | Unique identifier for a resource, https://ac.tdwg.org/termlist/#ac_accessURI |
| datemodified | date last modified, which is assigned by iDigBio |
| dqs | data quality score assigned by iDigBio |
| etag | tag assigned by iDigBio |
| flags | data quality flag assigned by iDigBio |
| format | media format, https://purl.org/dc/terms/format |
| hasSpecimen | TRUE or FALSE, indicates if there is an associated record for this media |
| licenselogourl | media license, https://ac.tdwg.org/termlist/#ac_licenseLogoURL |
| mediatype | media object type |
| modified | date modified, https://purl.org/dc/terms/modified |
| recordids | list of UUIDs for associated records |
| records | UUID for the associated record. Use this field to connect Record downloads with Media downloads |
| recordset | indicates the iDigBio recordset the observation belongs to |
| rights | media rights, https://purl.org/dc/terms/rights |
| tag | general keywords or tags, https://rs.tdwg.org/ac/terms/tag |
| type | media type, https://purl.org/dc/terms/type |
| uuid | Universally Unique IDentifier assigned by iDigBio |
| version | media record version assigned by iDigBio |
| webstatement | media rights, https://developer.adobe.com/xmp/docs/XMPNamespaces/xmpRights/ |
| xpixels | as defined by EXIF, x dimension in pixels |
| ypixels | as defined by EXIF, y dimension in pixels |
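As the table notes, the records field links a media row to its record. The sketch below illustrates that join with base R. The two data frames are hypothetical stand-ins for the results of idig_search_records() and idig_search_media(), with invented UUIDs and URLs, and it assumes that each media row holds exactly one record UUID in records. Real downloads may need that column flattened first.

```r
# Hypothetical stand-in for a record download (invented UUIDs).
records_demo <- data.frame(
  uuid = c("rec-1", "rec-2", "rec-3"),
  scientificname = "galax urceolata",
  stateprovince = c("north carolina", "virginia", "georgia"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a media download (invented UUIDs and URLs).
media_demo <- data.frame(
  uuid = c("med-a", "med-b", "med-c"),
  records = c("rec-1", "rec-1", "rec-3"),
  accessuri = c("https://example.org/a.jpg", NA, "https://example.org/c.jpg"),
  stringsAsFactors = FALSE
)

# Rename the media identifier so it does not clash with the record uuid.
names(media_demo)[names(media_demo) == "uuid"] <- "media_uuid"

# Inner join: keep only records that have at least one media row.
joined <- merge(records_demo, media_demo, by.x = "uuid", by.y = "records")
joined
```

Because this is an inner join, records without media are dropped. Pass all.x = TRUE to merge() to keep them.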
The media search above retained `r tryCatch({if(nrow(galax_media)) nrow(galax_media) else "N/A"}, error = function(e){cat("error in vignette: ", e$message)})` rows; however, some of these observations do not have information in the accessuri field. To obtain only records with accessuri, we indicate that we only want records where data.ac:accessURI exists by setting mq, or media query, as follows:
galax_media2 <- idig_search_media(rq=list(scientificname="Galax urceolata"), mq=list("data.ac:accessURI"=list("type"="exists")))
Now we have `r tryCatch({if(nrow(galax_media2)) nrow(galax_media2) else "N/A"}, error = function(e){cat("error in vignette: ", e$message)})` observations with accessuri!
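The same filter can be approximated locally on a media data frame that has already been downloaded, which is a handy cross-check. The sketch below uses a hypothetical data frame with invented values, and it assumes that missing access URIs arrive as NA or as empty strings.

```r
# Hypothetical stand-in for a media download (invented UUIDs and URLs).
media_demo <- data.frame(
  uuid = c("med-a", "med-b", "med-c", "med-d"),
  accessuri = c("https://example.org/a.jpg", NA, "", "https://example.org/d.jpg"),
  stringsAsFactors = FALSE
)

# Keep rows whose accessuri is neither NA nor an empty string.
has_uri <- !is.na(media_demo$accessuri) & nzchar(media_demo$accessuri)
media_with_uri <- media_demo[has_uri, , drop = FALSE]

nrow(media_with_uri)
```

Filtering in the query with mq avoids downloading rows you will discard, so the local filter is best kept for data already on disk.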