ncbi_byname: Retrieve gene sequences from NCBI by taxon name and gene...

ncbi_byname    R Documentation

Retrieve gene sequences from NCBI by taxon name and gene names.

Description

Retrieve gene sequences from NCBI by taxon name and gene names.

Usage

ncbi_byname(
  taxa,
  gene = "COI",
  seqrange = "1:3000",
  getrelated = FALSE,
  verbose = TRUE,
  batch_size = 100,
  ...
)

Arguments

taxa

(character) Scientific name to search for.

gene

(character) Gene or genes (in a vector) to search for. See examples.

seqrange

(character) Sequence range, e.g., "1:1000". This is the range of sequence lengths to search for, so "1:1000" means search for sequences from 1 to 1000 characters in length.

getrelated

(logical) If TRUE, gets the longest sequences of a species in the same genus as the one searched for. If FALSE, returns nothing when no match is found.

verbose

(logical) If TRUE (default), informative messages are printed.

batch_size

(integer) Number of names to query per batch.

...

Curl options passed on to crul::verb-GET
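
The seqrange and batch_size arguments described above can be illustrated with a small base R sketch. It makes no HTTP request and is not the package's internal code: parse_seqrange() is a hypothetical helper, the inclusive treatment of both bounds is an assumption, and the way names are grouped into batches is only one plausible reading of "names per batch".

```r
# Hypothetical helper (not part of traits): split a seqrange string such as
# "1:1000" into integer length bounds.
parse_seqrange <- function(seqrange) {
  bounds <- as.integer(strsplit(seqrange, ":", fixed = TRUE)[[1]])
  stopifnot(length(bounds) == 2, !anyNA(bounds), bounds[1] <= bounds[2])
  list(min_len = bounds[1], max_len = bounds[2])
}

rng <- parse_seqrange("1:1000")

# Assumption: both bounds are inclusive, so a 1000-character sequence matches.
seq_lengths <- c(650L, 1000L, 1534L)
in_range <- seq_lengths >= rng$min_len & seq_lengths <= rng$max_len

# Group taxon names into batches of at most batch_size names each.
taxa <- c("Acipenser brevirostrum", "Colletes similis",
          "Halictus ligatus", "Perdita californica")
batch_size <- 3
batches <- split(taxa, ceiling(seq_along(taxa) / batch_size))
```

Here in_range flags which of the example lengths fall inside the requested range, and batches is a list holding the names that would be queried together.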

Details

Predicted sequences are removed automatically, so you do not have to filter them out yourself. Predicted sequences are those with accession numbers that have "XM_" or "XR_" prefixes. This function retrieves one sequence for each species, picking the longest available for the given gene.
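
The filtering and selection described above can be sketched in base R on a toy table. This is an illustration of the idea, not the function's actual implementation: the column names (taxon, acc_no, length) are assumptions, and the accession numbers and lengths are made-up placeholders rather than real NCBI records.

```r
# Toy search hits; accessions and lengths are placeholders, not real records.
hits <- data.frame(
  taxon  = c("Colletes similis", "Colletes similis",
             "Colletes similis", "Halictus ligatus"),
  acc_no = c("XM_000001", "ZZ000001", "ZZ000002", "XR_000002"),
  length = c(2900L, 650L, 1534L, 800L),
  stringsAsFactors = FALSE
)

# Drop predicted records, i.e. accessions starting with "XM_" or "XR_".
observed <- hits[!grepl("^(XM|XR)_", hits$acc_no), ]

# Sort by taxon, longest first, then keep the first row per taxon.
ordered <- observed[order(observed$taxon, -observed$length), ]
longest <- ordered[!duplicated(ordered$taxon), ]
```

In this toy case the only hit for "Halictus ligatus" is a predicted record, so that taxon has no row left after filtering, while "Colletes similis" keeps its longest non-predicted record.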

Value

data.frame

Author(s)

Scott Chamberlain

See Also

ncbi_searcher(), ncbi_byid()

Examples

# Empty input returns immediately (no HTTP request)
ncbi_byname(taxa = character())


if (interactive()) {
  # A single species
  ncbi_byname(taxa = "Acipenser brevirostrum")

  # Many species
  species <- c("Colletes similis", "Halictus ligatus", "Perdita californica")
  ncbi_byname(taxa = species, gene = c("coi", "co1"), seqrange = "1:2000")
}


traits documentation built on April 10, 2026, 5:06 p.m.