fetch_metadata: Fetch metadata from GenBank
In joelnitta/gbfetch: Fetch sequences from GenBank into R

View source: R/fetch_metadata.R

fetch_metadata

R Documentation

Fetch metadata from GenBank

Description

Sequences downloaded from GenBank with fetch_sequences only include the title and/or accession number. Use fetch_metadata to obtain other useful metadata associated with the sequences.

Usage

fetch_metadata(query, chunk_size = 10, max_tries = 10,
  verbose = FALSE, higher_taxa = FALSE)

Arguments

`query`	String used to query NCBI GenBank. For more about the NCBI query format see https://www.ncbi.nlm.nih.gov/books/NBK3837/#EntrezHelp.Entrez_Searching_Options
`chunk_size`	Number of IDs to use for each chunk. Changing this does not tend to affect the results, but lower values give a more accurate progress bar.
`max_tries`	Maximum number of times to attempt the loop.
`verbose`	Logical; should information about the number of loops attempted be printed to the screen?
`higher_taxa`	Logical; should higher taxonomic ranks (family and order) be included in the results?

Details

entrez_search is used to obtain a vector of IDs from the 'query', then entrez_summary is used to download metadata for those IDs. However, entrez_summary fails if too many IDs are passed at once (more than roughly 200-300). Therefore, fetch_metadata splits the IDs into chunks (a list of vectors) and loops over the list.
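The chunking step described above can be sketched in base R. This is a minimal illustration, not the package's actual code: the helper name chunk_ids and the placeholder IDs are assumptions made for the example.

```r
# Hypothetical helper, not part of gbfetch: split a vector of IDs into
# a list of vectors, each holding at most `chunk_size` IDs.
chunk_ids <- function(ids, chunk_size = 10) {
  stopifnot(chunk_size >= 1)
  if (length(ids) == 0) return(list())
  # Group index 1, 1, ..., 2, 2, ... assigns each ID to a chunk
  groups <- ceiling(seq_along(ids) / chunk_size)
  unname(split(ids, groups))
}

ids <- as.character(1:25)  # placeholder IDs, not real GenBank records
chunks <- chunk_ids(ids, chunk_size = 10)
lengths(chunks)  # number of IDs in each chunk
```

Each element of chunks could then be passed to a summary request on its own, which keeps every request well below the size at which entrez_summary tends to fail.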

Sometimes errors occur during the loop because the API rejects the request, the internet connection drops, etc. To work around this, the loop repeats until it finishes or the number of attempts reaches 'max_tries', at which point it quits with an error.
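The retry behaviour can be sketched as follows. This is an illustration under stated assumptions rather than the implementation in fetch_metadata: the helper retry and the simulated function flaky are hypothetical, and flaky stands in for a real network request.

```r
# Hypothetical helper, not part of gbfetch: call `f()` until it succeeds
# or `max_tries` attempts have been made, then stop with an error.
retry <- function(f, max_tries = 10, verbose = FALSE) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    if (verbose) {
      message("Attempt ", attempt, " failed: ", conditionMessage(result))
    }
  }
  stop("Failed after ", max_tries, " attempts")
}

# Stand-in for a network request that fails twice before succeeding
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated API error")
  "ok"
}

retry_result <- retry(flaky, max_tries = 5, verbose = TRUE)
```

With verbose = TRUE the sketch reports each failed attempt, which mirrors the purpose of the 'verbose' argument described above.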

Value

Tibble of metadata resulting from the GenBank query. Columns include:

gi: GenBank GI number
accession: GenBank accession number
taxid: Taxon ID (can use to query with taxize)
title: Sequence title
slen: Sequence length
subname: Miscellaneous data (specimen, collection country, etc.), separated by |
subtype: Column names of the miscellaneous data in subname, separated by |
species: Species name
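
Because subname and subtype are parallel |-separated strings, one row's values can be paired into a named vector. The sketch below assumes both strings contain the same number of fields; the helper name parse_subname and the example values are made up for illustration.

```r
# Hypothetical helper, not part of gbfetch: pair the field names in
# `subtype` with the values in `subname` for a single row.
parse_subname <- function(subtype, subname) {
  # fixed = TRUE treats "|" literally rather than as regex alternation
  keys <- strsplit(subtype, "|", fixed = TRUE)[[1]]
  values <- strsplit(subname, "|", fixed = TRUE)[[1]]
  # Assumption: both strings have the same number of fields
  stopifnot(length(keys) == length(values))
  stats::setNames(values, keys)
}

# Made-up example values, not taken from a real GenBank record
parsed <- parse_subname("specimen_voucher|country", "Nitta 123|Japan")
parsed[["country"]]
```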

Examples

## Not run: 
fetch_metadata("rbcl[Gene] AND Crepidomanes[ORGN]")

## End(Not run)

joelnitta/gbfetch documentation built on March 2, 2024, 7:03 p.m.
