bold
is an R package to connect to BOLD Systems (https://www.boldsystems.org/) via their API. Functions in bold
let you search for sequence data, specimen data, sequence + specimen data, and download raw trace files.
See also the taxize book for more options for taxonomic workflows with BOLD: https://taxize.dev/
Install
Install bold
from CRAN
install.packages("bold")
Or install the development version from GitHub
remotes::install_github("ropensci/bold")
Load the package
library("bold")
bold_tax_name
searches for names with names.
bold_tax_name(name = 'Diplura') #> taxid taxon tax_rank tax_division parentid parentname #> 1 603673 Diplura genus Protista 53974 Scytosiphonaceae #> 2 734358 Diplura class Animalia 20 Arthropoda #> specimenrecords taxonrep representitive_image.image #> 1 6 <NA> <NA> #> 2 308 Diplura BSOIL/GBOL20120+1540220834.jpg #> representitive_image.apectratio input #> 1 NA Diplura #> 2 0.841 Diplura
bold_tax_name(name = c('Diplura', 'Osmia')) #> taxid taxon tax_rank tax_division parentid parentname #> 1 603673 Diplura genus Protista 53974 Scytosiphonaceae #> 2 734358 Diplura class Animalia 20 Arthropoda #> 3 4940 Osmia genus Animalia 4962 Megachilinae #> specimenrecords taxonrep representitive_image.image #> 1 6 <NA> <NA> #> 2 308 Diplura BSOIL/GBOL20120+1540220834.jpg #> 3 3007 Osmia BUSA/IMG_3061.jpg #> representitive_image.apectratio input #> 1 NA Diplura #> 2 0.841 Diplura #> 3 1.486 Osmia
bold_tax_id
searches for names with BOLD identifiers.
bold_tax_id(id = 88899) #> input taxid taxon tax_rank tax_division parentid parentname #> 1 88899 88899 Momotus genus Animalia 88898 Momotidae
bold_tax_id(id = c(88899, 125295)) #> input taxid taxon tax_rank tax_division parentid parentname #> 1 88899 88899 Momotus genus Animalia 88898 Momotidae #> 2 125295 125295 Helianthus genus Plantae 151101 Asteroideae #> taxonrep #> 1 <NA> #> 2 Helianthus
The BOLD sequence API gives back sequence data, with a bit of metadata.
The default is to get a list back
bold_seq(taxon = 'Coelioxys')[1:2] #> [[1]] #> [[1]]$id #> [1] "ABEE117-17" #> #> [[1]]$name #> [1] "Coelioxys elongata" #> #> [[1]]$gene #> [1] "ABEE117-17" #> #> [[1]]$sequence #> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTATCATTATATACATATCATCCTTCCCCATCAGTTGATTTAGCAATTTTTTYTTTACATTTATCAGGAATTTYTTYTATTATCGGATCAATAAATTTTATTGTAACAATTTTAATAATAAAAAATTATTCAATAAATTATAATCAAATACCTTTATTTCCATGATCAATTTTAATTACTACAATTTTATTATTATTATCATTACCTGTATTAGCAGGAGCTATTACAATATTATTATTTGATCGTAATTTAAATTCATCATTTTTTGACCCAATAGGAGGAGGAGATCCTATTTTATATCAACATTTATTTTG------------------------------------" #> #> #> [[2]] #> [[2]]$id #> [1] "ABEE193-17" #> #> [[2]]$name #> [1] "Coelioxys rufescens" #> #> [[2]]$gene #> [1] "ABEE193-17" #> #> [[2]]$sequence #> [1] "---------------------------------------------------------------------------------------------------------------------------------------------AATAATGATATAATTTATAACTCTTTTATTACAGCTCATGCATTTTTAATAATTTTTTTTTTAGTTATACCTTTTTTAATTGGAGGATTTGGAAATTGATTAGCTCCTTTAATATTAGGAGCTCCAGATATAGCATTCCCTCGAATAAATAATATTAGATTTTGATTATTACCTCCTTCTTTATTAATATTATTAACTAGTAATTTAATTAATCCTAGACCAGGAACAGGATGAACAATTTATCCTCCTTTATCTTTATATAATTATCATCCTTCACCATCAGTAGATTTAGCAATTTTTTCTTTACATTTATCAGGAGTATCATCTATTATTGGTTCAATAAATTTTATTGTAACAATTTTATTAATAAAAAATTATTCAATAAATTATAATCAAATACCTTTATTCCCAKGATCAATTTTAATCACTACAATTTTATTATTATTATCTTTGCCTGTTTTAGCAGGAGCAATTACAATATTATTATTTGATCGAAATCTAAATTCATCCTTTTTTGACCCTTTAGGAGGAGGGGATCCAATTTTATACCAACATTTATTTTGATTTTTTGGACATCC---------------------"
You can optionally get back the crul
response object
res <- bold_seq(taxon = 'Coelioxys', response = TRUE) res$response_headers #> $status #> [1] "HTTP/2 200 " #> #> $server #> [1] "nginx" #> #> $date #> [1] "Mon, 20 Apr 2020 16:23:04 GMT" #> #> $`content-type` #> [1] "application/x-download" #> #> $`x-powered-by` #> [1] "PHP/5.3.15" #> #> $`content-disposition` #> [1] "attachment; filename=fasta.fas" #> #> $`x-frame-options` #> [1] "SAMEORIGIN" #> #> $`x-content-type-options` #> [1] "nosniff" #> #> $`x-xss-protection` #> [1] "1; mode=block"
You can do geographic searches
bold_seq(geo = "USA") #> [[1]] #> [[1]]$id #> [1] "GBAN1777-08" #> #> [[1]]$name #> [1] "Macrobdella decora" #> #> [[1]]$gene #> [1] "GBAN1777-08" #> #> [[1]]$sequence #> [1] "---------------------------------ATTGGAATCTTGTATTTCTTATTAGGTACATGATCTGCTATAGTAGGGACCTCTATA---AGAATAATTATTCGAATTGAATTAGCTCAACCTGGGTCGTTTTTAGGAAAT---GATCAAATTTACAATACTATTGTTACTGCTCATGGATTAATTATAATTTTTTTTATAGTAATACCTATTTTAATTGGAGGGTTTGGTAATTGATTAATTCCGCTAATA---ATTGGTTCTCCTGATATAGCTTTTCCACGTCTTAATAATTTAAGATTTTGATTACTTCCGCCATCTTTAACTATACTTTTTTGTTCATCTATAGTCGAAAATGGAGTAGGTACTGGATGGACTATTTACCCTCCTTTAGCAGATAACATTGCTCATTCTGGACCTTCTGTAGATATA---GCAATTTTTTCACTTCATTTAGCTGGTGCTTCTTCTATTTTAGGTTCATTAAATTTTATTACTACTGTAGTTAATATACGATGACCAGGGATATCTATAGAGCGAATTCCTTTATTTATTTGATCCGTAATTATTACTACTGTATTGCTATTATTATCTTTACCAGTATTAGCAGCT---GCTATTTCAATATTATTAACAGATCGTAACTTAAATACTAGATTTTTTGACCCAATAGGAGGAGGGGATCCTATTTTATTCCAACATTTATTTTGATTTTTTGGCCACCCTGAAGTTTATATTTTAATTTTACCAGGATTTGGAGCTATTTCTCATGTAGTAAGTCATAACTCT---AAAAAATTAGAACCGTTTGGATCATTAGGGATATTATATGCAATAATTGGAATTGCAATTTTAGGTTTTATTGTTTGAGCACATCATATATTTACAGTAGGTCTTGATGTAGATACACGAGCTTATTTTACAGCAGCTACAATAGTTATTGCTGTTCCTACAGGAATTAAAGTATTTAGGTGATTG---GCAACT" #> #> #> [[2]] #> [[2]]$id #> [1] "GBAN1780-08" #> #> [[2]]$name #> [1] "Haemopis terrestris" #> #> [[2]]$gene #> [1] "GBAN1780-08" #> #> [[2]]$sequence #> [1] "---------------------------------ATTGGAACWTTWTATTTTATTTTNGGNGCTTGATCTGCTATATTNGGGATCTCAATA---AGGAATATTATTCGAATTGAGCCATCTCAACCTGGGAGATTATTAGGAAAT---GATCAATTATATAATTCATTAGTAACAGCTCATGGATTAATTATAATTTTCTTTATGGTTATGCCTATTTTGATTGGTGGGTTTGGTAATTGATTACTACCTTTAATA---ATTGGAGCCCCTGATATAGCTTTTCCTCGATTAAATAATTTAAGTTTTTGATTATTACCACCTTCATTAATTATATTGTTAAGATCCTCTATTATTGAAAGAGGGGTAGGTACAGGTTGAACCTTATATCCTCCTTTAGCAGATAGATTATTTCATTCAGGTCCATCGGTAGATATA---GCTATTTTTTCATTACATATAGCTGGAGCATCATCTATTTTAGGCTCATTAAACTTTATTTCTACAATTATTAATATACGAATTAAAGGTATAAGATCTGATCGAGTACCTTTATTTGTATGATCAGTTGTTATTACAACAGTTCTGTTATTATTGTCTTTACCTGTTTTAGCTGCA---GCTATTACTATATTATTAACAGATCGTAATTTAAATACTACTTTTTTTGATCCTATAGGAGGTGGAGATCCAGTATTGTTTCAACACTTATTTTGATTTTTTGGTCATCCAGAAGTATATATTTTGATTTTACCAGGATTTGGAGCAATTTCTCATATTATTACAAATAATTCT---AAAAAATTGGAACCTTTTGGATCTCTTGGTATAATTTATGCTATAATTGGAATTGCAGTTTTAGGGTTTATTGTATGAGCCCATCATATATTTACTGTAGGATTAGATGTTGATACTCGAGCTTATTTTACAGCAGCTACTATAGTTATTGCTGTTCCTACTGGTATTAAAGTTTTTAGGTGATTA---GCAACA" #> #> #> [[3]] #> [[3]]$id #> [1] "GBNM0293-06" #> #> [[3]]$name #> [1] "Steinernema carpocapsae" #> #> [[3]]$gene #> [1] "GBNM0293-06" #> #> [[3]]$sequence #> [1] "---------------------------------------------------------------------------------ACAAGATTATCTCTTATTATTCGTTTAGAGTTGGCTCAACCTGGTCTTCTTTTGGGTAAT---GGTCAATTATATAATTCTATTATTACTGCTCATGCTATTCTTATAATTTTTTTCATAGTTATACCTAGAATAATTGGTGGTTTTGGTAATTGAATATTACCTTTAATATTGGGGGCTCCTGATATAAGTTTTCCACGTTTGAATAATTTAAGTTTTTGATTGCTACCAACTGCTATATTTTTGATTTTAGATTCTTGTTTTGTTGACACTGGTTGTGGTACTAGTTGAACTGTTTATCCTCCTTTGAGG---ACTTTAGGTCACCCTGGYAGAAGTGTAGATTTAGCTATTTTTAGTCTTCATTGTGCAGGAATTAGCTCAATTTTAGGGGCTATTAATTTTATATGTACTACAAAAAATCTTCGTAGTAGTTCTATTTCTTTGGAACATATAAGACTTTTTGTTTGGGCTGTTTTTGTTACTGTTTTTTTATTAGTTTTATCTTTACCTGTTTTAGCTGGTGCTATTACTATGCTTTTAACAGACCGTAATTTAAATACTTCTTTTTTT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------" #> #> #> [[4]] #> [[4]]$id #> [1] "NEONV108-11" #> #> [[4]]$name #> [1] "Aedes thelcter" #> #> [[4]]$gene #> [1] "NEONV108-11" #> #> [[4]]$sequence #> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGATCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACTCTTGATCGACTACCTTTATTCGTTTGATCTGTAGTAATTACAGCTGTTTTATTACTTCTTTCACTTCCTGTATTAGCTGGAGCTATTACAATACTATTAACTGATCGAAATTTAAATACATCTTTCTTTGATCCAATTGGAGGAGGAGACCCAATTTTATACCAACATTTATTT" #> #> #> [[5]] #> [[5]]$id #> [1] "NEONV109-11" #> #> [[5]]$name #> [1] "Aedes thelcter" #> #> [[5]]$gene #> [1] "NEONV109-11" #> #> [[5]]$sequence #> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGGTCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACTCTTGATCGACTACCTTTATTCGTTTGATCTGTAGTAATTACAGCTGTTTTATTACTTCTTTCACTTCCTGTATTAGCTGGAGCTATTACAATACTATTAACTGATCGAAATTTAAATACATCTTTCTTTGACCCAATTGGAGGGGGAGACCCAATTTTATACCAACATTTATTT"
And you can search by researcher name
bold_seq(researchers = 'Thibaud Decaens')[[1]] #> $id #> [1] "COLNO015-09" #> #> $name #> [1] "Ptiliidae" #> #> $gene #> [1] "COLNO015-09" #> #> $sequence #> [1] "AACCTTGTATTTTATGTTCGGNGCTTGAGCTGGAATAGTCGGGACAAGTTTGAGTCTCCTTATCCGAACTGAACTCGGCACTCCAGGTTCACTAATTGGAGACGACCAAATCTACAACGTAATCGTAACAGCTCATGCTTTTGTGATGATTTTTTTTATGGTCATGCCGATTTTAATCGGAGGTTTCGGAAACTGACTTGTTCCCTTGATACTTGGAGCCCCTGATATAGCTTTCCCTCGAATGAACAACATAAGGTTCTGGCTCCTCCCCCCTTCTCTGACCCTACTTTTAATGAGAAGGATAGTAGAAAGCGGAGCAGGAACAGGGTGAACAGTTTATCCTCCCCTAGCTTCAAATATTGCTCATGGTGGAGCATCAGTTGATTTGGCGATTTTTAGGCTCCATTTAGCTGGAATCTCTTCCATTCTAGGAGCCGTAAATTTCATCACAACAATTCTCAACATACGAACTCCTCAAATAAGGTTTGATCAAATGCCATTGTTTGTTTGGGCTGTTGGAATTACAGCTCTCCTTCTTCTTCTTTCACTTCCNGTTTTAGCCGGAGCTATCACAATATTGCTAAC-------------------------------------------------------------------------"
by taxon IDs
bold_seq(ids = c('ACRJP618-11', 'ACRJP619-11')) #> [[1]] #> [[1]]$id #> [1] "ACRJP618-11" #> #> [[1]]$name #> [1] "Lepidoptera" #> #> [[1]]$gene #> [1] "ACRJP618-11" #> #> [[1]]$sequence #> [1] "------------------------TTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAATTATGATCAAATACCACTATTTGTGTGATCAGTAGGAATTACTGCTTTACTCTTATTACTTTCTCTTCCAGTATTAGCAGGTGCTATCACTATATTATTAACGGATCGAAATTTAAATACATCATTTTTTGATCCTGCAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT" #> #> #> [[2]] #> [[2]]$id #> [1] "ACRJP619-11" #> #> [[2]]$name #> [1] "Lepidoptera" #> #> [[2]]$gene #> [1] "ACRJP619-11" #> #> [[2]]$sequence #> [1] "AACTTTATATTTTATTTTTGGTATTTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAATTATGATCAAATACCACTATTTGTGTGATCAGTAGGAATTACTGCTTTACTCTTATTACTTTCTCTTCCAGTATTAGCAGGTGCTATCACTATATTATTAACGGATCGAAATTTAAATACATCATTTTTTGATCCTGCAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT"
by container (containers include project codes and dataset codes)
bold_seq(container = 'ACRJP')[[1]] #> $id #> [1] "ACRJP003-09" #> #> $name #> [1] "Lepidoptera" #> #> $gene #> [1] "ACRJP003-09" #> #> $sequence #> [1] "AACATTATATTTTATTTTTGGGATCTGATCTGGAATAGTAGGGACATCTTTAAGTATACTAATTCGAATAGAACTAGGAAATCCTGGATGTTTAATTGGGGATGATCAAATTTATAATACTATTGTTACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCCATTATAATTGGAGGTTTTGGCAATTGACTTGTACCATTAATATTAGGAGCCCCTGATATAGCATTTCCCCGAATAAATAATATAAGATTTTGACTTCTTCCCCCCTCATTAATTTTATTAATTTCAAGAAGAATTGTTGAAAATGGAGCAGGAACAGGATGAACAGTCTATCCTCCATTATCTTCTAATATTGCGCATAGAGGATCCTCTGTTGATTTAGCTATTTTCTCACTTCATTTAGCAGGAATTTCTTCTATTTTAGGAGCAATTAATTTTATTACAACTATTATTAATATACGAATAAATAATTTACTTTTTGACCAAATACCTCTATTTGTTTGAGCAGTAGGTATTACAGCTGTTCTTCTTTTATTATCATTACCAGTATTAGCAGGAGCAATTACCATACTATTAACAGATCGTAATTTAAATACTTCTTTCTTTGATCCTGCTGGAGGAGGAGATCCAATTTTATACCAACATTTATTT"
by bin (a bin is a Barcode Index Number)
bold_seq(bin = 'BOLD:AAA5125')[[1]] #> $id #> [1] "ASARD6776-12" #> #> $name #> [1] "Eacles ormondei" #> #> $gene #> [1] "ASARD6776-12" #> #> $sequence #> [1] "AACTTTATATTTTATTTTTGGAATTTGAGCAGGTATAGTAGGAACTTCTTTAAGATTACTAATTCGAGCAGAATTAGGTACCCCCGGATCTTTAATTGGAGATGACCAAATTTATAATACCATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGATTAGTACCCCTAATACTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGATTTTGACTATTACCCCCATCTTTAACCCTTTTAATTTCTAGAAGAATTGTCGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCCCTTTCATCTAATATTGCTCATGGAGGCTCTTCTGTTGATTTAGCTATTTTTTCCCTTCATCTAGCTGGAATCTCATCAATTTTAGGAGCTATTAATTTTATCACAACAATCATTAATATACGACTAAATAATATAATATTTGACCAAATACCTTTATTTGTATGAGCTGTTGGTATTACAGCATTTCTTTTATTGTTATCTTTACCTGTACTAGCTGGAGCTATTACTATACTTTTAACAGATCGAAACTTAAATACATCATTTTTTGACCCAGCAGGAGGAGGAGATCCTATTCTCTATCAACATTTATTT"
And there are more ways to query, check out the docs for ?bold_seq
.
The BOLD specimen API doesn't give back sequences, only specimen data. By default you download tsv
format data, which is given back to you as a data.frame
res <- bold_specimens(taxon = 'Osmia') head(res[,1:8]) #> processid sampleid recordID catalognum fieldnum #> 1 BEECA373-06 05-NT-0373 514740 05-NT-0373 #> 2 BEECA607-06 05-NT-0607 516959 05-NT-0607 #> 3 BEECA963-07 01-OR-0790 554153 01-OR-0790 #> 4 BEECB358-07 04-WA-1076 596920 BBSL697174 04-WA-1076 #> 5 BEECB438-07 00-UT-1157 597000 BBSL432653 00-UT-1157 #> 6 BEECC1176-09 02-UT-2849 1060879 BBSL442586 02-UT-2849 #> institution_storing collection_code bin_uri #> 1 York University, Packer Collection NA BOLD:AAI2013 #> 2 York University, Packer Collection NA BOLD:AAC8510 #> 3 York University, Packer Collection NA BOLD:ABZ3184 #> 4 Utah State University, Logan Bee Lab NA BOLD:AAC5797 #> 5 Utah State University, Logan Bee Lab NA BOLD:AAF2159 #> 6 York University, Packer Collection NA BOLD:AAE5368
You can optionally get back the data in XML
format
bold_specimens(taxon = 'Osmia', format = 'xml')
<?xml version="1.0" encoding="UTF-8"?> <bold_records xsi:noNamespaceSchemaLocation="http://www.boldsystems.org/schemas/BOLDPublic_record.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <record> <record_id>1470124</record_id> <processid>BOM1525-10</processid> <bin_uri>BOLD:AAN3337</bin_uri> <specimen_identifiers> <sampleid>DHB 1011</sampleid> <catalognum>DHB 1011</catalognum> <fieldnum>DHB1011</fieldnum> <institution_storing>Marjorie Barrick Museum</institution_storing> </specimen_identifiers> <taxonomy>
You can choose to get the crul
response object back if you'd rather work with the raw data returned from the BOLD API.
res <- bold_specimens(taxon = 'Osmia', format = 'xml', response = TRUE) res$url #> [1] "https://v4.boldsystems.org/index.php/API_Public/specimen?taxon=Osmia&format=xml" res$status_code #> [1] 200 res$response_headers #> $status #> [1] "HTTP/2 200 " #> #> $server #> [1] "nginx" #> #> $date #> [1] "Mon, 20 Apr 2020 16:23:41 GMT" #> #> $`content-type` #> [1] "application/x-download" #> #> $`x-powered-by` #> [1] "PHP/5.3.15" #> #> $`content-disposition` #> [1] "attachment; filename=bold_data.xml" #> #> $`x-frame-options` #> [1] "SAMEORIGIN" #> #> $`x-content-type-options` #> [1] "nosniff" #> #> $`x-xss-protection` #> [1] "1; mode=block"
The specimen/sequence combined API gives back specimen and sequence data. Like the specimen API, this one gives by default tsv
format data, which is given back to you as a data.frame
. Here, we're setting sepfasta=TRUE
so that the sequence data is given back as a list, and taken out of the data.frame
returned so the data.frame
is more manageable.
res <- bold_seqspec(taxon = 'Osmia', sepfasta = TRUE) res$fasta[1:2] #> $`BEECA373-06` #> [1] "-ATTTTATATATAATTTTTGCTATATGATCAGGTATAATCGGATCAGCAATAAGAATTATTATTCGTATAGAATTAAGAATTCCTGGTTCATGAATTTCAAATGATCAAACTTATAACTCTTTAGTAACTGCTCATGCTTTTTTAATAATTTTTTTCTTAGTTATACCTTTTTTAATTGGAGGATTTGGAAATTGATTAATTCCTTTAATATTAGGAATCCCGGATATAGCTTTCCCTCGAATAAATAATATTAGATTTTGACTTTTACCCCCTTCATTAATATTATTACTTTTAAGAAATTTTATAAATCCAAGACCAGGTACTGGATGAACTGTTTATCCTCCTCTTTCTTCTCATTTATTTCATTCTTCTCCTTCAGTTGATATAGCCATTTTTTCTTTACATATTTCCGGTTTATCTTCTATTATAGGTTCGTTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAACATATCCAATTACCTTTATTTCCATGATCTGTTTTTATTACTACTATCTTATTACTTTTTTCTTTACCTGTTTTAGCAGGAGCTATTACTATATTATTATTTGATCGAAATTTTAATACTTCATTTTTTGATCCTACAGGAGGTGGAGATCCAATCCTTTATCAACATTTATTT" #> #> $`BEECA607-06` #> [1] "AATATTATATATAATTTTTGCTTTGTGATCTGGAATAATTGGTTCATCTATAAGAATTATTATTCGTATAGAATTAAGAATTCCTGGTTCATGAATTTCAAATGATCAAGTTTATAATTCATTAGTTACAGCTCATGCTTTTTTAATAATTTTTTTTTTAGTTATACCATTTATAATTGGAGGATTTGGAAATTGATTAGTTCCTTTAATATTAGGAATTCCTGATATAGCTTTTCCTCGAATAAATAATATTAGATTTTGATTATTACCACCATCATTAATACTTTTACTTTTAAGAAATTTTTTTAATCCAAGTTCAGGAACTGGATGAACTGTATATCCTCCTCTTTCATCATATTTATTTCATTCTTCACCTTCTGTTGATTTAGCTATTTTTTCTCTTCATATATCAGGTTTATCTTCTATTATAGGTTCATTAAACTTTATTGTAACTATTATTATAATAAAAAATATTTCTTTAAAGTATATTCAATTGCCATTATTTCCATGATCTGTTTTTATTACTACAATTCTTTTATTATTATCATTACCAGTTTTAGCAGGTGCTATTACTATATTATTATTTGATCGAAATTTTAATACTTCATTTTTTGATCCTACAGGAGGGGGAG--------------------------"
Or you can index to a specific sequence like
res$fasta['GBAH0293-06'] #> $`GBAH0293-06` #> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAATATATTCAATTACCTTTATTTTCTTGATCTGTATTTATTACTACTATTCTTTTATTATTTTCTTTACCTGTATTAGCTGGAGCTATTACTATATTATTATTTGATCGAAATTTTAATACATCTTTTTTTGATCCAACAGGAGGGGGAGATCCAATTCTTTATCAACATTTATTTTGATTTTTTGGTCATCCTGAAGTTTATATTTTAATTTTACCTGGATTTGGATTAATTTCTCAAATTATTTCTAATGAAAGAGGAAAAAAAGAAACTTTTGGAAATATTGGTATAATTTATGCTATATTAAGAATTGGACTTTTAGGTTTTATTGTT---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------"
This function downloads files to your machine - it does not load them into your R session - but prints out where the files are for your information.
bold_trace(taxon = 'Osmia', quiet = TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.