blastSequences: Run a blast query to NCBI for either a string or an entrez...


blastSequences R Documentation

Run a blast query to NCBI for either a string or an entrez gene ID and then return a series of MultipleAlignment objects.

Description

This function sends a query to NCBI, given either a sequence string or an Entrez gene ID, and returns the results, by default as a series of DNAMultipleAlignment objects.

Usage

  blastSequences(x, database, hitListSize, filter, expect, program,
      timeout=40, as=c("DNAMultipleAlignment", "data.frame", "XML"))

Arguments

x

A sequence as a character vector or an integer corresponding to an entrez gene ID. Submit multiple sequences as a length-1 character vector, x = ">ID-1\nACATGCTA\n>ID-2\nAAACCACTT".

database

Which NCBI database to use. If not “blastn”, then set as="XML".

hitListSize

Number of hits to keep.

filter

Sequence filter: “L” for low complexity, “R” for human repeats, “m” for mask lookup.

expect

The BLAST ‘expect’ (E-value) threshold. Matches with an E-value at or below this threshold are returned.

program

Which BLAST program to use.

timeout

Approximate maximum length of time, in seconds, to wait for a result.

as

character(1) indicating whether the result from the NCBI server should be parsed to a list of DNAMultipleAlignment instances, represented as a data.frame, or returned as XML.
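
The three choices listed for as in Usage follow a common R idiom for enumerated arguments. The sketch below shows how such a signature typically behaves, assuming the choice is resolved with match.arg(); the Usage signature suggests this, but this page does not confirm it. chooseFormat is a hypothetical stand-in, not part of annotate.

```r
# Hypothetical stand-in with the same 'as' signature as blastSequences;
# this is NOT the annotate implementation.
chooseFormat <- function(as = c("DNAMultipleAlignment", "data.frame", "XML")) {
    # match.arg() picks the first choice when 'as' is missing and
    # raises an error for values outside the listed choices.
    match.arg(as)
}

defaultFormat <- chooseFormat()             # nothing supplied: first choice
tableFormat   <- chooseFormat("data.frame") # explicit choice
```

Under that assumption, omitting as selects "DNAMultipleAlignment", and a value outside the three choices is an error rather than a silent fallback.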

Details

Right now the function only works for "blastn".

The NCBI URL API used by this function is documented at https://www.ncbi.nlm.nih.gov/blast/Doc/urlapi.html
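
Because several sequences must be passed to x as a single length-1 character vector, it can help to build that string from a named vector instead of typing it by hand. The following is a minimal sketch; toFastaString is a hypothetical helper, not a function provided by annotate.

```r
# Hypothetical helper (not part of annotate): collapse a named character
# vector of sequences into one FASTA-formatted, length-1 string.
toFastaString <- function(seqs) {
    stopifnot(is.character(seqs), !is.null(names(seqs)))
    # One ">ID\nSEQUENCE" record per element, records joined by newlines.
    paste0(">", names(seqs), "\n", seqs, collapse = "\n")
}

seqs <- c("ID-1" = "ACATGCTA", "ID-2" = "AAACCACTT")
x <- toFastaString(seqs)
```

The resulting x has the same form as the multi-sequence example given for the x argument and could then be passed to blastSequences.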

Value

By default, a series of DNAMultipleAlignment (see MultipleAlignment-class) objects. Alternatively, a data.frame or XML document returned from the NCBI server. The data.frame is a ‘long form’ representation of the ‘Iteration’, ‘Hit’ and ‘Hsp’ results returned from the server. The XML document is the result of the xmlParse function of the XML library, and follows the format described by https://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd and https://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.mod.dtd.
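
A long-form data.frame result lends itself to ordinary subsetting and ordering. The sketch below uses a small mock table instead of a live query. The column names Hit_accession and Hsp_evalue are assumptions modelled on the element names in the NCBI BlastOutput DTD, the values are invented for illustration, and storing E-values as character is also an assumption. Check names() and str() on a real result before relying on any of this.

```r
# Mock stand-in for a blastSequences(..., as = "data.frame") result.
# Column names and values are assumptions made for illustration only.
hits <- data.frame(
    Hit_accession = c("MOCK_A", "MOCK_B", "MOCK_C"),
    Hsp_evalue    = c("1e-20", "0.5", "3e-05"),
    stringsAsFactors = FALSE
)

# Convert E-values to numeric in case they arrive as character.
hits$Hsp_evalue <- as.numeric(hits$Hsp_evalue)

# Keep HSPs at or below an E-value cutoff, most significant first.
best <- hits[hits$Hsp_evalue <= 1e-3, ]
best <- best[order(best$Hsp_evalue), ]
```

Filtering after the fact like this complements the expect argument, which limits what the server returns in the first place.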

Author(s)

M. Carlson

Examples


## x can be an entrez gene ID
blastSequences(17702, timeout=40, as="data.frame")

if (interactive()) {

    ## or x can be a sequence
    blastSequences(x = "GGCCTTCATTTACCCAAAATG")

    ## hitListSize does not promise that you will get the number of
    ## matches you want. It will just try to get that many.
    blastSequences(x = "GGCCTTCATTTACCCAAAATG", hitListSize="20")

}

Bioconductor/annotate documentation built on Nov. 2, 2024, 4:40 p.m.