Description Usage Arguments Value Examples
BLAST results files must be in tabular format (e.g., outfmt 6). For each BLAST results file, the single top hit (sorted by ascending evalue, then descending bitscore) will be extracted from the BLAST database and written to 'out_dir'.
1 2 3 4 5 6 7 8 9 10 | extract_blast_hits(
blast_results_dir,
blast_results_pattern,
blast_cols = c("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
"qstart", "qend", "sstart", "send", "evalue", "bitscore"),
database_path,
out_dir,
out_ext = "bestmatch.fasta",
...
)
|
blast_results_dir |
Path to folder containing BLAST results files. |
blast_results_pattern |
Optional; pattern used for matching with grep. Only files with names matching the pattern will be used to extract top blast hits. |
blast_cols |
Character vector; column names of BLAST results. See https://www.ncbi.nlm.nih.gov/books/NBK279684/ (Table C1) for details. Must include at least 'sseqid', 'evalue', and 'bitscore'. Defaults to standard columns for output format 6. |
database_path |
Path to the BLAST database, including the database name. |
out_dir |
Path to folder to write results. |
out_ext |
File extension used when writing out top BLAST hit. |
... |
Additional other arguments. Not used by this function,
but meant to be used by |
TRUE when runs successfully; externally, the top blast hit for each fasta result will be written to 'out_dir'.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | library(ape)
# Make temp dir for storing files
temp_dir <- fs::dir_create(fs::path(tempdir(), "baitfindR_example"))
# Write out ape::woodmouse dataset as DNA
data(woodmouse)
ape::write.FASTA(woodmouse, fs::path(temp_dir, "woodmouse.fasta"))
ape::write.FASTA(woodmouse, fs::path(temp_dir, "woodmouse2.fasta"))
# Make blast database
build_blast_db(
fs::path(temp_dir, "woodmouse.fasta"),
db_type = "nucl",
out_name = "wood",
parse_seqids = TRUE,
wd = temp_dir)
# Blast the original sequences against the database
blast_n_list(
fasta_folder = temp_dir,
fasta_pattern = "fasta",
database_path = fs::path(temp_dir, "wood")
)
# Extract the top BLAST hit for each fasta file.
extract_blast_hits(
blast_results_dir = temp_dir,
blast_results_pattern = "\\.tsv$",
database_path = fs::path(temp_dir, "wood"),
out_dir = temp_dir
)
list.files(temp_dir)
ape::read.FASTA(fs::path(temp_dir, "woodmouse.tsv.bestmatch.fasta"))
# Cleanup.
fs::file_delete(temp_dir)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.