getTaxaIDs: getTaxaIDs


Description

This function attempts to determine a definitive AphiaID and TSN from various web services via some existing R packages including worrms, taxize and ritis.

It takes a species list with Scientific and Common names and performs checks in the following order:

  1. Check scientific names for AphiaIDs using taxize, then worrms

  2. Check scientific names for missing AphiaIDs using worrms

  3. Use found AphiaIDs to look for the TSNs

  4. Check scientific names for missing TSNs using ritis

  5. Use found TSNs to look for the AphiaIDs

  6. Check common names for missing AphiaIDs using taxize, then worrms

  7. Check common names for missing AphiaIDs using worrms

  8. Use found AphiaIDs to look for the missing TSNs

  9. Check common names for missing TSNs using ritis

  10. Use found TSNs to look for the missing AphiaIDs
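The ordering above amounts to a fallback cascade: each step only touches rows that are still missing an ID, and records where the value came from. The following is a minimal sketch of that idea, not the package's actual implementation. The function `fill_missing`, the lookup tables `svc_one`, `svc_two` and `svc_common`, and the ID values are all hypothetical stand-ins for real web-service calls.

```r
# Hypothetical lookup tables standing in for web services (made-up IDs).
svc_one    <- c("OSMERUS MORDAX" = 1001L)
svc_two    <- c("DENTALIUM ENTALE" = 1002L)
svc_common <- c("SMELT" = 1001L, "TUSK SHELL" = 1002L, "POLYCHAETE" = 1003L)

spec <- data.frame(
  sci  = c("OSMERUS MORDAX", "DENTALIUM ENTALE", "SPIO"),
  comm = c("SMELT", "TUSK SHELL", "POLYCHAETE"),
  ID = NA_integer_, ID_SVC = NA_character_, ID_SRC = NA_character_,
  stringsAsFactors = FALSE
)

# Fill only rows whose ID is still NA; record the service and the source field.
fill_missing <- function(df, key_col, lookup, svc, src) {
  todo  <- is.na(df$ID)
  found <- unname(lookup[df[[key_col]][todo]])  # NA when the name is unknown
  hit   <- !is.na(found)
  rows  <- which(todo)[hit]
  df$ID[rows]     <- found[hit]
  df$ID_SVC[rows] <- svc
  df$ID_SRC[rows] <- src
  df
}

# Scientific names first (two services), then common names as a last resort.
spec <- fill_missing(spec, "sci",  svc_one,    "service_one", "scientific name")
spec <- fill_missing(spec, "sci",  svc_two,    "service_two", "scientific name")
spec <- fill_missing(spec, "comm", svc_common, "service_one", "common name")
print(spec)
```

Because every step skips rows that already have an ID, earlier (more trusted) sources are never overwritten by later ones.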

In addition to all of the original fields, the returned data will include:

  1. APHIAID

  2. APHIAID_SRC - what data was used to find the AphiaID value (e.g. scientific name, common name)

  3. APHIAID_SVC - which service(s) provided the AphiaID value (e.g. taxize, ritis, worrms)

  4. APHIAID_DEFINITIVE - TRUE indicates a single, confident match with a service; FALSE indicates that either several potential matches were found, or that the matches were not recognized as authoritative by the service

  5. APHIAID_SPELLING - alternative spellings for the APHIAID_SRC value, as suggested by the service that found the APHIAID

  6. TSN

  7. TSN_SRC - what data was used to find the TSN value (e.g. scientific name, common name, APHIAID)

  8. TSN_SVC - which service(s) provided the TSN value (e.g. taxize, ritis, worrms)

  9. TSN_definitive - TRUE indicates a single, confident match with a service; FALSE indicates that either several potential matches were found, or that the matches were not recognized as authoritative by the service

  10. TSN_SPELLING - alternative spellings for the TSN_SRC value, as suggested by the service that found the TSN

  11. ID_SRC - populated with the version of this script that was used to find a match
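One way these fields might be used is to separate confident matches from rows that need manual review. The sketch below builds a small mock result by hand rather than calling getTaxaIDs; the ID values are made up, and it assumes that unmatched rows carry NA in APHIAID_DEFINITIVE, which the documentation does not confirm.

```r
# Mock of a returned data frame (made-up IDs; the NA handling is an assumption).
res <- data.frame(
  sci_names = c("OSMERUS MORDAX", "SPIO SP.", "PHYLLODOCE SP."),
  APHIAID = c(1001L, 1002L, NA),
  APHIAID_DEFINITIVE = c(TRUE, FALSE, NA),
  stringsAsFactors = FALSE
)

# `%in% TRUE` treats NA as "not definitive", so unmatched rows are flagged too.
needs_review <- res[!(res$APHIAID_DEFINITIVE %in% TRUE), ]
print(needs_review$sci_names)
```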

Usage

getTaxaIDs(spec_list = NULL, sci_col = NULL, comm_col = NULL,
  sci_Filts = NULL, comm_Filts = NULL, debug = F)

Arguments

spec_list

the data frame containing the species information to be matched to TSNs and AphiaIDs

sci_col

the name of the column of the dataframe containing the scientific names

comm_col

the name of the column of the dataframe containing the common names

sci_Filts

default is NULL - a vector of regex values that you might want to filter out of your scientific names prior to sending them to a web service. For example, some Maritimes names include "(NS)", which will prevent services from finding matches. By adding "\(NS\)" (escaping the brackets, and any periods, with backslashes), we ensure that the names are as clean as possible prior to searching.

comm_Filts

default is NULL - a vector of regex values that you might want to filter out of your common names prior to sending them to a web service. By default, the following values will be filtered from both sci_col and comm_col:

  • \(.*?\) - removes anything in brackets (e.g. "(NS)")

  • \b[a-zA-Z]{1,2}\. - removes one- or two-letter blocks of text (potentially followed by a period) (e.g. "SP.")

  • \,\s?(SMALL|LARGE) - removes instances like ",SMALL"

  • UNIDENTIFIED - removes the word "UNIDENTIFIED"

  • UNID\. - removes the word "UNID."

  • EGGS - removes the word "EGGS"

  • PURSE\s - removes the word "PURSE"
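The effect of these filters can be illustrated with base R's `gsub`. This is a sketch under stated assumptions, not the package's own cleaning code: the helper `clean_names` is hypothetical, the second pattern is assumed to be `\b[a-zA-Z]{1,2}\.`, and `perl = TRUE` plus the final whitespace tidy-up are choices made for this example. Note that inside an R string each backslash must be doubled.

```r
# The default patterns listed above, written as R string literals.
default_filts <- c(
  "\\(.*?\\)",              # anything in brackets
  "\\b[a-zA-Z]{1,2}\\.",    # one- or two-letter block followed by a period
  "\\,\\s?(SMALL|LARGE)",   # ",SMALL" / ", LARGE"
  "UNIDENTIFIED",
  "UNID\\.",
  "EGGS",
  "PURSE\\s"
)

# Hypothetical helper: strip every pattern, then tidy up leftover whitespace.
clean_names <- function(x, filts) {
  for (f in filts) x <- gsub(f, "", x, perl = TRUE)
  trimws(gsub("\\s+", " ", x))
}

cleaned <- clean_names(
  c("SEASNAILS (NS) LIP.SP.", "TRUNKFISHES (NS)", "LIPARIS  SP."),
  default_filts
)
print(cleaned)
```

A user-supplied filter such as "\\(NS\\)" would be applied the same way, by appending it to the vector of patterns.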

debug

default is FALSE. This just ensures that the log file is overwritten rather than creating many new ones.

Author(s)

Mike McMahon, Mike.McMahon@dfo-mpo.gc.ca

See Also

Other speciesCodes: assignDefinitive, cleanPrepareSpecList, do_ritis, do_taxize, do_worrmsAphiaID, do_worrmsTSN, do_worrms

Examples

testData <- data.frame(
  internals_codes = 1:7,
  sci_names = c("OSMERUS MORDAX", "SELACHII (CHONDRICHTHYES) (CLASS)",
  "LIPARIS  SP.","OSTRACIONTIDAE (OSTRACIIDAE)",
  "PHYLLODOCE SP.","SPIO SP.","DENTALIUM ENTALE"),
  comm_names = c("SMELT", "CARTILAGINOUS FISHES",
  "SEASNAILS (NS) LIP.SP.","TRUNKFISHES (NS)",
  "POLYCHAETE","POLYCHAETE","TUSK SHELL")
)
getTaxaIDs(spec_list = testData, sci_col = "sci_names", comm_col = "comm_names")

Maritimes/bio.odissupport documentation built on May 31, 2019, 8:01 a.m.