Retrieve gene sequences from NCBI by taxon name and gene names.

Description

Retrieve gene sequences from NCBI by taxon name and gene names.

Usage

ncbi_byname(taxa, gene = "COI", seqrange = "1:3000", getrelated = FALSE,
  verbose = TRUE)

Arguments

taxa

(character) Scientific name to search for.

gene

(character) Gene or genes (in a vector) to search for. See examples.

seqrange

(character) Sequence length range, e.g., "1:1000". Only sequences whose length falls within this range are searched for, so "1:1000" means sequences from 1 to 1000 characters long.
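
To make the range format concrete, here is a minimal sketch of how a string like "1:1000" could be read as a pair of length bounds. The helper parse_seqrange() is hypothetical and is not part of the package. The sketch also assumes that both bounds are inclusive, and the sequence lengths are toy values.

```r
# Hypothetical helper, not part of the package: split a range string such as
# "1:1000" into integer lower and upper length bounds.
parse_seqrange <- function(seqrange) {
  parts <- strsplit(seqrange, ":", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  bounds <- as.integer(parts)
  stopifnot(!anyNA(bounds), bounds[1] <= bounds[2])
  list(min = bounds[1], max = bounds[2])
}

bounds <- parse_seqrange("1:1000")

# Toy sequence lengths; keep only those inside the requested range
# (assuming both ends of the range are inclusive).
seq_lengths <- c(650L, 1000L, 1534L)
in_range <- seq_lengths[seq_lengths >= bounds$min & seq_lengths <= bounds$max]
```

Under these assumptions, the 1534-character entry falls outside "1:1000" and is dropped.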

getrelated

(logical) If TRUE and no match is found for the searched species, gets the longest sequence of a species in the same genus as the one searched for. If FALSE, returns nothing when no match is found.

verbose

(logical) If TRUE (default), informative messages are printed.

Details

Predicted sequences are removed automatically, so you do not have to filter them out yourself. Predicted sequences are those whose accession numbers have an "XM_" or "XR_" prefix. This function retrieves one sequence per species, picking the longest available for the given gene.
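
The filtering described above can be sketched in base R on a toy data.frame. This is an illustration of the idea only, not the function's actual implementation. The column names (taxon, acc_no, length) and all values, including the accession numbers, are made up for the example.

```r
# Toy records; column names and values are invented for illustration.
hits <- data.frame(
  taxon  = c("Genus alpha", "Genus alpha", "Genus alpha", "Genus beta"),
  acc_no = c("XM_000001", "AB000001", "AB000002", "XR_000002"),
  length = c(2000L, 650L, 1200L, 900L),
  stringsAsFactors = FALSE
)

# Drop predicted records: accessions starting with "XM_" or "XR_".
observed <- hits[!grepl("^X[MR]_", hits$acc_no), ]

# Sort by taxon, longest first, then keep the first row per taxon.
observed <- observed[order(observed$taxon, -observed$length), ]
longest <- observed[!duplicated(observed$taxon), ]
```

In this toy data, "Genus beta" has only a predicted record, so it drops out entirely, and "Genus alpha" keeps its longest non-predicted record.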

Value

Data.frame of results.

Author(s)

Scott Chamberlain myrmecocystus@gmail.com

See Also

ncbi_search, ncbi_getbyid

Examples

## Not run: 
# A single species
ncbi_byname(taxa="Acipenser brevirostrum")

# Many species
species <- c("Colletes similis","Halictus ligatus","Perdita trisignata")
ncbi_byname(taxa=species, gene = c("coi", "co1"), seqrange = "1:2000")

## End(Not run)