get_seq_for_DB: Get nucleotide sequences from NCBI

Description

Retrieves nucleotide sequences from NCBI for given identification numbers.

Usage

get_seq_for_DB(
  ids,
  db,
  check.result = FALSE,
  return = "data.frame",
  fasta.file = NULL,
  exclude.from.download = FALSE,
  exclude.var,
  exclude.pattern,
  exclude.fixed = TRUE,
  verbose = TRUE
)

get_seq_for_DB_fix(res.data, db, verbose = TRUE)

Arguments

ids

vector of NCBI sequences' identification numbers: GenBank accession numbers, GenInfo identifiers (GI) or Entrez unique identifiers (UID)

db

character; NCBI database for search. See entrez_dbs() for possible values

check.result

logical; check if download was done correctly

return

character; type of object returned; possible values are "vector", "data.frame" and "fasta"

fasta.file

character; FASTA file name and path, only used if return = "fasta"

exclude.from.download

logical; ignore some sequences while downloading

exclude.var

vector that is used to define which sequences should be ignored, only used if exclude.from.download = TRUE.

exclude.pattern

value that is matched against exclude.var to mark unwanted sequences, only used if exclude.from.download = TRUE

exclude.fixed

logical; match exclude.pattern as is, only used if exclude.from.download = TRUE.

verbose

logical; show messages

res.data

data.frame; data frame of nucleotide ids and previously downloaded sequences

Details

Master records (for example, in a WGS project) do not contain any nucleotide sequence. They can be excluded from download with the exclude.from.download parameters. However, this is optional: such ids have no effect on the download and do not have to be excluded when loading.
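
The exclusion parameters can be pictured with a minimal base R sketch. It assumes that exclude.var is a vector parallel to ids and that exclude.pattern is matched literally when exclude.fixed = TRUE, in the manner of grepl(fixed = TRUE); the ids, titles and pattern are invented placeholders, and this is not the package's actual implementation.

```r
# Placeholder ids and record titles, invented for illustration only.
ids <- c("id1", "id2", "id3", "id4")
titles <- c(
  "Example virus strain A, complete genome",
  "Example bacterium, whole genome shotgun sequencing project",
  "Example virus strain B, complete genome",
  "Example bacterium contig 1, whole genome shotgun sequence"
)

# Assumed analogue of exclude.var = titles, exclude.pattern = "sequencing project",
# exclude.fixed = TRUE: flag every record whose title contains the literal pattern.
exclude.pattern <- "sequencing project"
is_excluded <- grepl(exclude.pattern, titles, fixed = TRUE)

# Only the ids that were not flagged would be downloaded.
ids_to_download <- ids[!is_excluded]
print(ids_to_download)
```

Under these assumptions only the master-record-like entry "id2" is dropped; the contig record stays because its title does not contain the literal pattern.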

If no fasta.file value is provided, a "seq.fasta" file is created in the working directory. When writing to an existing FASTA file, sequences are appended.
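
The append behaviour can be illustrated with a short base R sketch. The helper append_fasta() is hypothetical and is not part of the package; it only shows what appending records to an existing FASTA file looks like, using made-up ids and sequences.

```r
# Hypothetical helper (not a disprose function): append one FASTA record per id.
append_fasta <- function(ids, seqs, file = "seq.fasta") {
  stopifnot(length(ids) == length(seqs))
  # Each record is a ">id" header line followed by the sequence line.
  records <- paste0(">", ids, "\n", seqs, "\n")
  # append = TRUE keeps whatever the file already contains.
  cat(records, file = file, sep = "", append = TRUE)
  invisible(file)
}

fasta.file <- tempfile(fileext = ".fasta")
append_fasta(c("id1", "id2"), c("ACGT", "GGCC"), fasta.file)
append_fasta("id3", "TTAA", fasta.file)  # second call adds to the same file

fasta_lines <- readLines(fasta.file)
print(fasta_lines)
file.remove(fasta.file)
```

Because sequences are appended rather than overwritten, reusing a file name across runs accumulates records, so a fresh path (for example from tempfile()) may be preferable when a clean file is needed.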

Value

If return = "vector", the function returns a vector of nucleotide sequences; if return = "data.frame", it returns a data frame with nucleotide ids and nucleotide sequences; if return = "fasta", it writes a FASTA file and returns no data.

Functions

Author(s)

Elena N. Filatova

Examples

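library(disprose)

# identification numbers of three nucleotide records
ids <- c(2134240466, 2134240465, 2134240464)
fasta.file <- tempfile()
# download the sequences and write them to a temporary FASTA file
get_seq_for_DB(ids = ids, db = "nucleotide", check.result = TRUE,
               return = "fasta", fasta.file = fasta.file,
               exclude.from.download = FALSE)
file.remove(fasta.file)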

disprose documentation built on Jan. 6, 2022, 1:07 a.m.