Converts the output of an all-by-all blast query into a format that can be parsed by mcl to find clusters.
blast_to_mcl(
  path_to_ys = pkgconfig::get_config("baitfindR::path_to_ys"),
  blast_results,
  hit_fraction_cutoff,
  echo = pkgconfig::get_config("baitfindR::echo", fallback = FALSE),
  ...
)
path_to_ys |
Character vector of length one; the complete path to the folder containing Y&S python scripts, e.g., "/Users/me/apps/phylogenomic_dataset_construction/" |
blast_results |
Character vector of length one; the complete path to the tab-separated text file containing the results from an all-by-all blast search. If blast searches were run separately (i.e., one for each sample), the results should be concatenated into a single file. For the blast search, the output format should be specified as: -outfmt '6 qseqid qlen sseqid slen frames pident nident length mismatch gapopen qstart qend sstart send evalue bitscore' |
hit_fraction_cutoff |
Numeric between 0 and 1. Indicates the minimum overlap between query and target, expressed as a fraction, that a blast hit must have to be retained in the output. According to Y&S, "A low hit-fraction cutoff will output clusters with more incomplete sequences and much larger and sparser alignments, whereas a high hit-fraction cutoff gives tighter clusters but ignores incomplete or divergent sequences." |
echo |
Logical; should the standard output and error be printed to the screen? |
... |
Other arguments. Not used by this function, but meant to be used by |
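Because blast_results must point to a single file, per-sample blast outputs need to be combined before calling this function. The following is a minimal sketch of one way to do that in base R. The helper name concatenate_blast_results and the demo file names and rows are hypothetical and are not part of baitfindR.

```r
# Hypothetical helper (not part of baitfindR): concatenate several
# tab-separated blast result files into a single file.
concatenate_blast_results <- function(input_files, output_file) {
  # Start from an empty output file so repeated runs do not append twice
  file.create(output_file)
  for (f in input_files) {
    # file.append() copies the contents of f onto the end of output_file
    ok <- file.append(output_file, f)
    if (!ok) stop("Could not append ", f)
  }
  invisible(output_file)
}

# Demo with two small placeholder files in a temporary folder.
# Real files would contain all columns listed in the -outfmt string.
demo_dir <- tempfile("blast_demo")
dir.create(demo_dir)
sample_files <- file.path(demo_dir, c("sampleA.tab", "sampleB.tab"))
writeLines("seqA1\t300\tseqB1\t310", sample_files[1])
writeLines("seqB1\t310\tseqA1\t300", sample_files[2])

combined <- concatenate_blast_results(
  sample_files,
  file.path(demo_dir, "all_by_all.tab")
)
combined_lines <- readLines(combined)
```

The path stored in combined could then be passed as the blast_results argument.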
Wrapper for Yang and Smith (2014) blast_to_mcl.py
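To give a feel for what hit_fraction_cutoff controls, the sketch below filters a few made-up blast hits by how much of the query and target each alignment covers. The coverage formula used here is an assumption chosen for illustration. The actual calculation is done inside blast_to_mcl.py and may differ, for example in how multiple hits between the same pair of sequences are combined.

```r
# Made-up blast hits using a subset of the columns from the -outfmt string
hits <- data.frame(
  qseqid = c("q1", "q2", "q3"),
  qlen   = c(1000, 1000, 500),
  sseqid = c("s1", "s2", "s3"),
  slen   = c(1000, 800, 500),
  qstart = c(1, 101, 1),
  qend   = c(900, 300, 450),
  sstart = c(51, 1, 480),
  send   = c(950, 200, 31),
  stringsAsFactors = FALSE
)

# Assumed definition: fraction of each sequence covered by the aligned region.
# abs() handles hits where start > end (reverse-strand coordinates).
query_fraction   <- (abs(hits$qend - hits$qstart) + 1) / hits$qlen
subject_fraction <- (abs(hits$send - hits$sstart) + 1) / hits$slen

hit_fraction_cutoff <- 0.5

# Keep a hit only if both sequences are covered at least up to the cutoff
keep <- query_fraction >= hit_fraction_cutoff &
  subject_fraction >= hit_fraction_cutoff
kept <- hits[keep, ]
```

With these made-up values, the hit for q2 covers only a fifth of the query and is dropped, which mirrors the trade-off described by Y&S: raising the cutoff discards hits involving incomplete sequences.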
A tab-separated text file with three columns: the first two are the matching query and target from the all-by-all blast, and the third is the negative log e-value for that match. This file is named <blast_results>.hit-frac<hit_fraction_cutoff>.minusLogEvalue, where <blast_results> and <hit_fraction_cutoff> correspond to the values of those arguments. If possible contaminants (i.e., identical sequences between different samples) were found, these are written to <blast_results>.ident.hit-frac<hit_fraction_cutoff>. Output files will be written to the same folder containing blast_results.
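The naming pattern above can be reproduced in R to locate the output files after a run. This is a sketch only: the column names given to the table are hypothetical labels (the file itself has no header described in this page), and the exact way the cutoff is formatted in the file name is decided by the Python script, so it is worth checking against the files actually written.

```r
blast_results <- "some/folder/blastresults.tab"
hit_fraction_cutoff <- 0.5

# Expected output paths, following the naming pattern described above
mcl_input <- paste0(
  blast_results, ".hit-frac", hit_fraction_cutoff, ".minusLogEvalue"
)
possible_contaminants <- paste0(
  blast_results, ".ident.hit-frac", hit_fraction_cutoff
)

# Read the three-column file only if it is actually there.
# Column names are hypothetical labels chosen for readability.
if (file.exists(mcl_input)) {
  mcl_table <- read.delim(
    mcl_input,
    header = FALSE,
    col.names = c("query", "target", "minus_log_evalue"),
    stringsAsFactors = FALSE
  )
}
```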
Joel H Nitta, joelnitta@gmail.com
Yang, Y. and S.A. Smith. 2014. Orthology inference in non-model organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics. Molecular Biology and Evolution 31:3081-3092. https://bitbucket.org/yangya/phylogenomic_dataset_construction/overview
## Not run: blast_to_mcl(blast_results = "some/folder/blastresults.tab", hit_fraction_cutoff = 0.5)