submit_NCBI_BLAST: Submits a list of DNA sequences to NCBI BLAST for alignment...

View source: R/submit_NCBI_BLAST_request.R

submit_NCBI_BLAST R Documentation

Submits a list of DNA sequences to NCBI BLAST for alignment to a sequence database.

Description

Submits a list of DNA sequences to NCBI BLAST for alignment to a sequence database.

Usage

submit_NCBI_BLAST(
  seq.list,
  res.dir,
  delay.req = 10,
  email,
  db = "genomic/9606/GCF_000001405.39"
)

Arguments

seq.list

A named list of DNA sequences, each stored as a character string, to be BLASTed against a database on NCBI servers.

res.dir

A character string specifying the directory where NCBI BLAST submission results will be stored. Each BLAST result is stored in a separate folder within the given directory. BLAST hits for each submission are available as an XML file in the matching folder.

delay.req

An integer specifying the mandatory delay, in seconds, between BLAST request submissions (Default: delay.req = 10).

email

A character string specifying the e-mail address to attach to the NCBI BLAST request. This argument is mandatory.

db

A character string specifying the NCBI genome or reference sequence set database against which the submitted sequences will be BLASTed (Default: db = "genomic/9606/GCF_000001405.39", the human GRCh38 (hg38) genome assembly). For more supported databases see Details.
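
The expected shape of seq.list (a named list holding one character string per sequence) can be checked before submitting anything. The sketch below is only an illustration: check_seq_list is a hypothetical helper that is not part of the package, and the set of accepted letters (A, C, G, T and N) is an assumption rather than a documented rule.

```r
# Hypothetical helper (not part of the package): checks that an object looks
# like the seq.list argument described above.
check_seq_list <- function(seq.list) {
  # Must be a list with a unique, non-empty name for every element.
  if (!is.list(seq.list)) stop("seq.list must be a list.")
  nms <- names(seq.list)
  if (is.null(nms) || any(is.na(nms)) || any(nms == "") ||
      anyDuplicated(nms) > 0) {
    stop("Every sequence in seq.list needs a unique, non-empty name.")
  }
  # Each element must be a single character string.
  is.string <- vapply(
    seq.list, function(x) is.character(x) && length(x) == 1L, logical(1))
  if (!all(is.string)) stop("Each sequence must be a single character string.")
  # Assumption: only A, C, G, T and N are accepted (case-insensitive).
  is.dna <- grepl("^[ACGTN]+$", unlist(seq.list), ignore.case = TRUE)
  if (!all(is.dna)) {
    stop("Non-DNA characters found in: ", paste(nms[!is.dna], collapse = ", "))
  }
  invisible(TRUE)
}
```

Calling check_seq_list() on a valid list returns TRUE invisibly; any other input stops with a message naming the problem.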

Details

Other supported NCBI genome databases include:

  • Human GRCh37 (hg19): db = "genomic/9606/GCF_000001405.25"

  • Mus musculus MGSCv37 (mm9): db = "genomic/10090/GCF_000001635.18"

  • Mus musculus GRCm38.p6 (mm10): db = "genomic/10090/GCF_000001635.26"

  • Mus musculus GRCm39 (mm39): db = "genomic/10090/GCF_000001635.27"

  • Drosophila melanogaster (fruit fly) Release 6 plus ISO1 MT: db = "genomic/7227/GCF_000001215.4"

  • Danio rerio (zebrafish) GRCz11: db = "genomic/7955/GCF_000002035.6"

You can try other db strings based on the following database name structure:
db = "genomic/{taxonomy ID}/{RefSeq GCF assembly accession ID}"
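
The name structure above can be assembled programmatically. This is a minimal sketch: make_db_string is a hypothetical helper that is not part of the package, and the accession check assumes the usual RefSeq pattern of "GCF_", nine digits, a dot, and a version number. A well-formed string is no guarantee that NCBI BLAST actually serves that database.

```r
# Hypothetical helper (not part of the package): assembles a db string from a
# taxonomy ID and a RefSeq GCF assembly accession.
make_db_string <- function(tax.id, gcf.accession) {
  # Assumption: accessions look like "GCF_" + 9 digits + "." + version number.
  if (!grepl("^GCF_[0-9]{9}\\.[0-9]+$", gcf.accession)) {
    stop("Unexpected RefSeq assembly accession: ", gcf.accession)
  }
  sprintf("genomic/%d/%s", as.integer(tax.id), gcf.accession)
}

make_db_string(9606, "GCF_000001405.39")  # human GRCh38, the default db
make_db_string(7955, "GCF_000002035.6")   # zebrafish GRCz11
```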

Once the submission starts, log messages appear in the console.
The unique submission ID is displayed on a line such as "Run G01J99FG013 : 00:00:01".
To track a submission directly from the NCBI BLAST web interface, open the NCBI BLAST website and paste the submission ID (in this case: G01J99FG013) in the "Request ID" field.
Once the submission has finished processing, the web interface gives access to all results and lets you save them in other file formats.
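
If the console output has been captured as text, the submission ID can be pulled out of such a line with a regular expression. This is a sketch under one assumption: the log line follows the "Run <ID> : <elapsed time>" pattern quoted above. extract_submission_id is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper: pulls the submission ID out of a console log line.
# Assumption: the line follows the "Run <ID> : <elapsed time>" pattern;
# lines in any other format yield NA.
extract_submission_id <- function(log.line) {
  pattern <- "^Run\\s+([A-Za-z0-9]+)\\s*:.*$"
  ifelse(
    grepl(pattern, log.line), sub(pattern, "\\1", log.line), NA_character_)
}

extract_submission_id("Run G01J99FG013 : 00:00:01")
```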

Author(s)

Yoann Pageaud.

Examples

# Create an example list of sequences to BLAST using the NCBI BLAST API.
ls.seq <- list(
  "7qtel" = "CCCTAACACTGTTAGGGTTATTATGTTGACTGTTCTCATTGCTGTCTTAG",
  "1ptel" = "GATCCTTGAAGCGCCCCCAAGGGCATCTTCTCAAAGTTGGATGTGTGCAT",
  "17qtel" = "CCCTAACCCTAAACCCTAGCCCTAGCCCTAGCCCTAGCCCTAGCCCTAGC")
# Submit the list of sequences to NCBI BLAST using the GRCh38/hg38 genome assembly database.
submit_NCBI_BLAST(
  seq.list = ls.seq, res.dir = "~/result_directory",
  email = "myemailadress@dkfz.de", db = "genomic/9606/GCF_000001405.39")
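
After the submissions have finished, the XML files can be located by scanning the result directory. The sketch below relies only on the layout described for res.dir (one folder per result, one XML file in each). list_blast_xml is a hypothetical helper, and the folder and file names it returns are whatever the package wrote; no particular naming is assumed.

```r
# Hypothetical helper: lists every XML file found below a results directory,
# grouped by the folder that contains it (one folder per BLAST result).
list_blast_xml <- function(res.dir) {
  res.dir <- path.expand(res.dir)
  if (!dir.exists(res.dir)) stop("Directory not found: ", res.dir)
  xml.files <- list.files(
    res.dir, pattern = "\\.xml$", recursive = TRUE, full.names = TRUE,
    ignore.case = TRUE)
  split(xml.files, basename(dirname(xml.files)))
}
```

For the example above, list_blast_xml("~/result_directory") would return a named list with one entry per result folder.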

YoannPa/NCBI.BLAST2DT documentation built on July 1, 2023, 1:03 a.m.