read.GenBank: Read DNA Sequences from GenBank via Internet

Description

This function connects to the GenBank database, and reads nucleotide sequences using accession numbers given as arguments.

Usage

read.GenBank(access.nb, seq.names = access.nb, species.names = TRUE,
             gene.names = FALSE, as.character = FALSE)

Arguments

access.nb

a vector of mode character giving the accession numbers.

seq.names

the names to give to each sequence; by default the accession numbers are used.

species.names

a logical indicating whether to attach the species names to the returned object as an attribute.

gene.names

obsolete (will be removed soon).

as.character

a logical controlling whether to return the sequences as character vectors; by default (FALSE) they are returned as an object of class "DNAbin".

Details

The sequences are retrieved from the NCBI site http://www.ncbi.nlm.nih.gov/.

If species.names = TRUE, the returned list has an attribute "species" containing the names of the species taken from the field “ORGANISM” in GenBank.

Since ape 3.6, this function retrieves the sequences in FASTA format, which is more efficient and more flexible (scaffolds and contigs can be read). The option gene.names is obsolete and will be removed; the gene information is also present in the attribute "description".

Setting species.names = FALSE is considerably faster, which can be useful if you read a series of scaffolds or contigs, or if you already have the species names.
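As an illustration of the two options discussed above, the following is a minimal sketch rather than part of the package's own examples. It skips the species lookup and asks for character output. The wrapper around the call is an assumption: it treats any download error (for instance, no Internet connection) as a NULL result so that the rest of the code can still run.

```r
library(ape)

## Two of the tanager accession numbers used in the Examples section.
ref <- c("U15717", "U15718")

## Assumption: a failed download raises an error or returns NULL;
## both cases are mapped to NULL here.
seqs <- tryCatch(
    read.GenBank(ref, species.names = FALSE, as.character = TRUE),
    error = function(e) NULL
)

if (!is.null(seqs)) {
    ## each element is a character vector, one nucleotide per element
    print(head(seqs[[1]]))
    ## the FASTA description lines are kept as an attribute
    print(attr(seqs, "description"))
}
```

With species.names = FALSE, the attribute "species" is not expected on the result; only "description" is queried here.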

Value

A list of DNA sequences, each stored as a vector of class "DNAbin" or, if as.character = TRUE, as a vector of single characters. The list has two attributes: "species" (only if species.names = TRUE) and "description".

Author(s)

Emmanuel Paradis

See Also

read.dna, write.dna, dist.dna, DNAbin

Examples

## This won't work if your computer is not connected
## to the Internet

## Get the 8 sequences of tanagers (Ramphocelus)
## as used in Paradis (1997)
ref <- c("U15717", "U15718", "U15719", "U15720",
         "U15721", "U15722", "U15723", "U15724")
## Copy/paste or type the following commands if you
## want to try them.
## Not run: 
Rampho <- read.GenBank(ref)
## get the species names:
attr(Rampho, "species")
## build a matrix with the species names and the accession numbers:
cbind(attr(Rampho, "species"), names(Rampho))
## print the first sequence
## (can be done with `Rampho$U15717` as well)
Rampho[[1]]
## the description from each FASTA sequence:
attr(Rampho, "description")

## End(Not run)

ape documentation built on April 5, 2018, 1:03 a.m.