blast_n_list: Run blastn on all fasta files in a folder.

Description

Output is written to the same folder containing the input files.

Usage

blast_n_list(
  fasta_folder,
  fasta_pattern,
  database_path,
  out_ext = "tsv",
  outfmt = "6",
  other_args = NULL,
  overwrite = FALSE,
  echo = FALSE,
  get_hash = TRUE,
  ...
)

Arguments

fasta_folder

Path to the folder containing fasta files to BLAST.

fasta_pattern

Optional; pattern used for matching with grep. Only files with names matching the pattern will be included in the BLAST search.
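As a rough illustration of grep-style matching, the base R sketch below filters a set of hypothetical file names with the pattern "fasta". It does not call blast_n_list, and how the function applies the pattern internally (for example via grepl or list.files) is an assumption.

```r
# Hypothetical file names in a folder (for illustration only)
files <- c("woodmouse.fasta", "woodmouse2.fasta", "notes.txt", "wood.nhr")

# Keep only the names that match the regular expression "fasta"
matched <- files[grepl("fasta", files)]
print(matched)

# The pattern is a regular expression, so it can be anchored
# to the end of the name to match the file extension only
anchored <- files[grepl("\\.fasta$", files)]
print(anchored)
```

An anchored pattern such as "\\.fasta$" avoids accidentally picking up files that merely contain "fasta" somewhere in their name.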

database_path

Path to the BLAST database, including the database name.

out_ext

File extension used for BLAST results files. Each BLAST search writes one results file, named after its input fasta file with this extension; in the Examples, woodmouse.fasta yields woodmouse.tsv.

outfmt

String; format to use for BLAST output. See https://www.ncbi.nlm.nih.gov/books/NBK279684/ (Table C1) for details.
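With outfmt = "6", BLAST+ writes tab-separated output without a header row. Unless a custom column list is requested, the 12 standard columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. The sketch below shows one way to attach those names when reading a results file with base R; the single result line is made up for illustration and is not real BLAST output.

```r
# Default column names for BLAST tabular output (outfmt 6)
blast_cols <- c(
  "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
  "qstart", "qend", "sstart", "send", "evalue", "bitscore"
)

# Made-up result line written to a temporary file (illustration only)
tsv_file <- tempfile(fileext = ".tsv")
writeLines(
  paste(c("seqA", "seqB", "100.000", "965", "0", "0",
          "1", "965", "1", "965", "0.0", "1783"), collapse = "\t"),
  tsv_file
)

# Read the headerless file and attach the column names
hits <- read.delim(tsv_file, header = FALSE, col.names = blast_cols,
                   stringsAsFactors = FALSE)
print(hits)

# Remove the temporary file
unlink(tsv_file)
```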

other_args

Character vector; other arguments to pass on to blastn. For a list of options, run blastn -help.

overwrite

Logical; should old output be erased before running this function? "Old output" is any file whose name matches 'out_ext'.
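To make the matching rule concrete, here is a minimal base R sketch of finding and removing files by extension in a hypothetical folder. It is not the function's actual implementation; the folder name, file names, and the exact pattern used are assumptions.

```r
# Hypothetical folder with one input file and two old result files
folder <- file.path(tempdir(), "overwrite_sketch")
dir.create(folder, showWarnings = FALSE)
invisible(file.create(file.path(folder, c("a.fasta", "a.tsv", "b.tsv"))))

out_ext <- "tsv"

# Find files whose names end in the output extension
old_output <- list.files(
  folder,
  pattern = paste0("\\.", out_ext, "$"),
  full.names = TRUE
)
print(basename(old_output))

# Remove them and check what is left
invisible(file.remove(old_output))
remaining <- list.files(folder)
print(remaining)

# Clean up the hypothetical folder
unlink(folder, recursive = TRUE)
```

Because the match is made on file names only, any file in the folder that carries the same extension would count as old output, whether or not it came from a previous BLAST run.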

echo

Logical; should standard error and output be printed?

get_hash

Logical; if TRUE, the MD5 hash of the output will be returned.
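For readers unfamiliar with file hashes, the sketch below computes the MD5 hash of a small stand-in results file with tools::md5sum from base R. Which hashing routine blast_n_list itself uses is not stated here, so this is only an illustration of the idea; the file contents are made up.

```r
# Write a small stand-in for a BLAST results file
out_file <- tempfile(fileext = ".tsv")
writeLines("seqA\tseqB\t100.000", out_file)

# MD5 hash of the file contents (a named character vector)
hash <- tools::md5sum(out_file)
print(unname(hash))

# A file with identical contents has an identical hash
copy_file <- tempfile(fileext = ".tsv")
invisible(file.copy(out_file, copy_file))
same_hash <- identical(unname(tools::md5sum(copy_file)), unname(hash))
print(same_hash)

# Remove the temporary files
unlink(c(out_file, copy_file))
```

A hash that changes only when the file contents change is what lets a workflow tool detect whether results differ between runs.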

...

Additional arguments. Not used by this function; intended for use by drake_plan for tracking during workflows.

Value

NULL, or a character vector of MD5 hashes if 'get_hash' is TRUE. As a side effect, a text file with the results of the blastn search is written for each input fasta file, named by adding 'out_ext' to the input fasta file name.

Author(s)

Joel H Nitta, joelnitta@gmail.com

References

https://www.ncbi.nlm.nih.gov/books/NBK279690/

Examples

library(ape)

# Make temp dir for storing files
temp_dir <- fs::dir_create(fs::path(tempdir(), "baitfindR_example"))

# Write out ape::woodmouse dataset as DNA
data(woodmouse)
ape::write.FASTA(woodmouse, fs::path(temp_dir, "woodmouse.fasta"))
ape::write.FASTA(woodmouse, fs::path(temp_dir, "woodmouse2.fasta"))

# Make blast database
build_blast_db(
  fs::path(temp_dir, "woodmouse.fasta"),
  db_type = "nucl",
  out_name = "wood",
  parse_seqids = TRUE,
  wd = temp_dir)

# Blast the original sequences against the database
blast_n_list(
  fasta_folder = temp_dir,
  fasta_pattern = "fasta",
  database_path = fs::path(temp_dir, "wood")
)

# Take a look at the results.
readr::read_tsv(
  fs::path(temp_dir, "woodmouse.tsv"),
  col_names = FALSE
  )

readr::read_tsv(
  fs::path(temp_dir, "woodmouse2.tsv"),
  col_names = FALSE
  )

# Cleanup.
fs::file_delete(temp_dir)

joelnitta/baitfindR documentation built on May 7, 2020, 6:21 p.m.