nameparser
parses taxonomic names. It's an R port of the Ruby gem biodiversity
.
devtools::install_github("sckott/nameparser")
You can use it as a library in Ruby, JRuby etc.
to fix capitalization in canonicals
ScientificNameParser.fix_case("QUERCUS (QUERCUS) ALBA") # Output: Quercus (Quercus) alba
to parse a scientific name into a ruby hash
parser.parse("Plantago major")
to get json representation
parser.parse("Plantago").to_json #or parser.parse("Plantago") parser.all_json
to clean name up
parser.parse(" Plantago major ")[:scientificName][:normalized]
to get only cleaned up latin part of the name
parser.parse("Pseudocercospora dendrobii (H.C. Burnett) U. \ Braun & Crous 2003")[:scientificName][:canonical]
to get detailed information about elements of the name
parser.parse("Pseudocercospora dendrobii (H.C. Burnett 1883) U. \ Braun & Crous 2003")[:scientificName][:details]
Returned result is not always linear, if name is complex. To get simple linear representation of the name you can use:
parser.parse("Pseudocercospora dendrobii (H.C. Burnett) \ U. Braun & Crous 2003")[:scientificName][:position] # returns {0=>["genus", 16], 17=>["species", 26], # 28=>["author_word", 32], 33=>["author_word", 40], # 42=>["author_word", 44], 45=>["author_word", 50], # 53=>["author_word", 58], 59=>["year", 63]} # where the key is the char index of the start of # a word, first element of the value is a semantic meaning # of the word, second element of the value is the character index # of end of the word
'Surrogate' is a broad group which includes 'Barcode of Life' names, and various undetermined names with cf. sp. spp. nr. in them:
parser.parse("Coleoptera BOLD:1234567")[:scientificName][:surrogate]
To parse using several CPUs (4 seem to be optimal)
parser = ParallelParser.new # ParallelParser.new(4) will try to run 4 processes if hardware allows array_of_names = ["Betula alba", "Homo sapiens"....] parser.parse(array_of_names) # Output: {"Betula alba" => {:scientificName...}, # "Homo sapiens" => {:scientificName...}, ...}
parallel parser takes list of names and returns back a hash with names as keys and parsed data as values
To get canonicals with ranks for infraspecific epithets:
parser = ScientificNameParser.new(canonical_with_rank: true) parser.parse('Cola cordifolia var. puberula \ A. Chev.')[:scientificName][:canonical] # Output: Cola cordifolia var. puberula
To resolve lsid and get back RDF file
LsidResolver.resolve("urn:lsid:ubio.org:classificationbank:2232671")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.