ns.blast.db: Set up a database for blasting sequences


Description

Prepare a database for blasting sequences. This can be either an existing database or a temporary database for the current R session only.

Usage

ns.blast.db(fasta.files = "", sequences = NULL, db.dir = "", blast.path = "")

Arguments

fasta.files

A vector of file names; each file contains FASTA sequences to be put in the database

sequences

A vector of sequences to put in the database

db.dir

Location where the database will be stored

blast.path

Location of the blast executable. Use this parameter to specify the location of the blast tools if that location is not in the PATH variable

Details

At least one of db.dir, fasta.files or sequences has to be set. If only db.dir is set, the previously created database in that directory is used. This database should have been created using the eblast package. If db.dir is not set, a temporary database is created, and the files listed in fasta.files and the sequences in sequences are added to it. Temporary databases are removed at the end of the R session. If db.dir is set together with fasta.files or sequences, the given sequences are added permanently to the existing database.

Each time the function is called without a db.dir parameter, a new temporary database is created in tempdir(). When blasting many times against the same sequences, reuse the database instead of creating a new one for every call; otherwise the disk can fill up quickly (see Examples).
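The combinations of db.dir and sequences described above can be sketched as follows. This is a rough illustration under several assumptions: that the package is installed under the name SeqLibR, that the blast tools are available, and that the directory should exist before the call (this page does not say). The directory name my_blast_db is hypothetical; it is placed under tempdir() here only so that the sketch leaves nothing behind, whereas a real permanent database would live in a lasting location.

```r
# Sketch only: "my_blast_db" is a hypothetical directory name, and the
# package name SeqLibR is an assumption.
db.dir <- file.path(tempdir(), "my_blast_db")
dir.create(db.dir, showWarnings = FALSE)

seqs <- c("sequence_1" = "ATGCGCGTACATCGCCCCCCCGGGGGG")

# Run the database calls only when the package is actually installed.
if (requireNamespace("SeqLibR", quietly = TRUE)) {
  # db.dir together with sequences: the sequences are added to the
  # database stored in db.dir.
  db <- SeqLibR::ns.blast.db(sequences = seqs, db.dir = db.dir)

  # db.dir only: use the database previously created in that directory.
  db.again <- SeqLibR::ns.blast.db(db.dir = db.dir)
}
```

Calling ns.blast.db with sequences only, and no db.dir, would instead create a fresh temporary database on every call.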

Value

If successful, the function returns a database identifier, which can be passed as the database parameter of the function ns.blast to identify the sequence database to blast against. FALSE is returned if database initialization failed.
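Because failure is signalled by a FALSE return value rather than an error, it can help to check the result before passing it on. The helper below is a hypothetical sketch and not part of the package; the name check_db and the error message are made up for illustration.

```r
# Hypothetical helper (not part of the package): stop early when
# database initialization failed, i.e. when FALSE was returned.
check_db <- function(db) {
  if (identical(db, FALSE)) {
    stop("database initialization failed")
  }
  db  # otherwise hand the identifier back unchanged
}
```

It could be wrapped around the call, for example check_db(ns.blast.db(sequences = ts)), so that a failed setup is reported before ns.blast is run.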

Author(s)

Wim de Leeuw (w.c.deLeeuw@uva.nl)

References

Blast: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaeffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.

Examples

ts <- c("sequence_1" = "ATGCGCGTACATCGCCCCCCCGGGGGG",
        "sequence_2" = "TCCCCCCCCGGGGGGATCTTATATATATCCCCCGGGGG")
# Self-self blast
ns.blast(ts, ns.blast.db(sequences = ts))
# better would be ...
my.db <- ns.blast.db(sequences = ts)
ns.blast(ts, my.db)
# ...because we can reuse the database to blast some other sequences
ns.blast(c("query_A" = "TCGCCCCCCCGGGGGG", "query_B" = "GATCTTATATATATCCC"), my.db, eval = 1)

UvA-MAD/SeqLibR documentation built on May 9, 2019, 9:40 p.m.