efetch: efetch
In gschofl/rentrez: An R interface to the NCBI EUtilities

Description Usage Arguments Details Value Class hierarchy for efetch Generics with methods for efetch Slots See Also Examples

“efetch” is an S4 class that provides a container for data retrieved by calls to the NCBI EFetch utility.

efetch retrieves data records in the requested format from a character vector of one or more primary UIDs or from a set of UIDs stored in the user's web environment.

  efetch(id, db = NULL, rettype = NULL, retmode = NULL,
    retstart = NULL, retmax = NULL, query_key = NULL,
    WebEnv = NULL, strand = NULL, seq_start = NULL,
    seq_stop = NULL, complexity = NULL)

`id`	(Required) List of UIDs provided either as a character vector, as an `esearch` instance, or by reference to a web environment and a query key obtained directly from previous calls to `esearch` (if `usehistory = TRUE`), `epost` or `elink`. If UIDs are provided as a plain character vector, `db` must be specified explicitly, and all of the UIDs must be from the database specified by `db`.
`db`	(Required only when `id` is a vector of UIDs) Database from which to retrieve records. See the NCBI EUtilities documentation for the supported databases.
`rettype`	A character string specifying the retrieval type, such as 'abstract' or 'medline' from PubMed, 'gp' or 'fasta' from protein, or 'gb', 'gbwithparts', or 'fasta_cds_na' from nuccore. See the NCBI EUtilities documentation for the allowed values for each database.
`retmode`	A character string specifying the data mode of the records returned, such as plain text, XML, or ASN.1. See the NCBI EUtilities documentation for the allowed values for each database.
`retstart`	Numeric index of the first record to be retrieved.
`retmax`	Total number of records from the input set to be retrieved.
`query_key`	An integer specifying which of the UID lists attached to a user's Web Environment will be used as input to `efetch`. (Usually obtained directly from objects returned by previous `esearch`, `epost` or `elink` calls.)
`WebEnv`	A character string specifying the Web Environment that contains the UID list. (Usually obtained directly from objects returned by previous `esearch`, `epost` or `elink` calls.)
`strand`	Strand of DNA to retrieve. (1: plus strand, 2: minus strand)
`seq_start`	First sequence base to retrieve.
`seq_stop`	Last sequence base to retrieve.
`complexity`	Data content to return. (0: entire data structure, 1: bioseq, 2: minimal bioseq-set, 3: minimal nuc-prot, 4: minimal pub-set)
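As a rough illustration of how the arguments above relate to an EFetch request, the sketch below assembles a query URL from the non-`NULL` arguments. The helper `build_efetch_url` is hypothetical and not part of this package; it assumes the standard EFetch endpoint and that each argument is passed through under its own name, with multiple UIDs joined by commas.

```r
# Hypothetical helper (not part of the package): assemble an EFetch URL
# from the arguments described above. NULL arguments are dropped.
build_efetch_url <- function(id = NULL, db = NULL, ...) {
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  params <- list(db = db, id = id, ...)
  params <- Filter(Negate(is.null), params)

  # URL-encode each value; vectors (e.g. several UIDs) are comma-joined.
  encode <- function(v) {
    paste(vapply(as.character(v), utils::URLencode, character(1),
                 reserved = TRUE, USE.NAMES = FALSE),
          collapse = ",")
  }
  query <- paste0(names(params), "=", vapply(params, encode, character(1)),
                  collapse = "&")
  paste0(base, "?", query)
}

url <- build_efetch_url(id = c("84785889", "84785885"), db = "nucleotide",
                        rettype = "fasta", retmode = "text")
print(url)
```

In practice `efetch` builds and submits the request itself; the sketch only makes the parameter mapping visible.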

See the official online documentation for NCBI's EUtilities for additional information.

The default retrieval mode (retmode) for the pubmed, nuccore, protein, and gene databases is 'text'. Default rettypes are 'medline', 'gb', 'gp', and 'gene_table', respectively.
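The defaults listed above can be pictured as a small lookup table. The following is only a sketch: `efetch_defaults` and `default_for` are illustrative names, not objects exported by the package, and the table covers only the four databases named in the text.

```r
# Illustrative lookup table of the defaults stated above (hypothetical names).
efetch_defaults <- data.frame(
  db      = c("pubmed", "nuccore", "protein", "gene"),
  rettype = c("medline", "gb", "gp", "gene_table"),
  retmode = "text",
  stringsAsFactors = FALSE
)

# Return the default rettype/retmode for a database, or NULL if not listed.
default_for <- function(db) {
  row <- efetch_defaults[efetch_defaults$db == db, ]
  if (nrow(row) == 0) return(NULL)
  list(rettype = row$rettype, retmode = row$retmode)
}

str(default_for("protein"))
```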

An efetch instance.

Super classes:

eutil

c
content
database
retmode
rettype
show
write

url: A character vector containing the query URL.
error: Any error or warning messages parsed from the output of the call submitted to Entrez.
content: A character vector holding the unparsed contents of a request to Entrez.
database: A character vector giving the name of the queried database.
rettype: Retrieval Type. A character vector specifying the record view returned, such as ‘Abstract’ or ‘MEDLINE’ from pubmed, or ‘GenPept’ or ‘FASTA’ from protein.
retmode: Retrieval Mode. A character vector specifying the data format of the records returned, such as plain ‘text’, ‘HTML’ or ‘XML’.

`efetch.batch` for downloading more than about 500 data records; `content` for retrieving data from `efetch` objects.
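To show why large downloads are usually split up, the sketch below computes `retstart`/`retmax` windows that cover a result set in chunks of at most 500 records. It does not reproduce how `efetch.batch` works internally; `batch_windows` is a hypothetical helper, and it assumes that `retstart` is zero-based, as in the NCBI EUtilities.

```r
# Hypothetical helper: split n records into retstart/retmax windows.
# Assumes retstart is zero-based (0 refers to the first record).
batch_windows <- function(n, size = 500L) {
  n <- as.integer(n)
  size <- as.integer(size)
  stopifnot(n >= 1L, size >= 1L)
  starts <- seq(0L, n - 1L, by = size)
  data.frame(retstart = starts, retmax = pmin(size, n - starts))
}

# 1200 records need three requests; the last one is shorter.
windows <- batch_windows(1200)
print(windows)
```

Each row could then be passed as the `retstart` and `retmax` arguments of one `efetch` call against a stored `WebEnv`/`query_key` pair.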

# Search the protein database for Chlamydia CPAF:
cpaf <- esearch("Chlamydia[organism] and CPAF", "protein")
cpaf

# Fetch the fasta sequence of the first 5 hits as TSeqSet XML data:
cpaf_fasta <- efetch(cpaf[1:5], rettype="fasta")
cpaf_fasta

# Directly download sequences using GIs:
gis <- c("84785889","84785885")
a <- efetch(gis, "nucleotide", retmode="text", rettype="fasta")

# Retrieve the downloaded records as a text string
# (named to avoid masking base::seq):
fasta_text <- content(a)

# Alternatively use accession numbers:
acc_no <- "AAA23146"
b <- efetch(acc_no, "protein", rettype="fasta")

# Download nucleotide GIs 84785889 and 84785885 in GenBank format (default):
gis <- c("84785889","84785885")
gb_records <- efetch(gis, "nucleotide", rettype="gb")

# Write the records to a file (the result is not named `c`, so base::c is not masked):
write(gb_records, file="~/data.gbk")

# Download data from pubmed
query <- "Chlamydia psittaci and genome and 2012[pdat]"
cpsit <- esearch(query, "pubmed", usehistory=TRUE)
publ <- efetch(cpsit)

# retrieve the xml data
publ_xml <- content(publ)