Given a folder containing unaligned sequences in fasta format (i.e., clusters), aligns each cluster with mafft (small clusters) or pasta (large clusters), excludes poorly aligned sites with phyutility, and infers a maximum-likelihood tree with RAxML (small clusters) or fasttree (large clusters). Requires all of these programs to be installed and included in the user's $PATH. Assumes clusters are named "cluster1.fa", "cluster2.fa", etc. Clusters with fewer than 1,000 sequences are considered "small," and those with more are considered "large."
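As a rough illustration of the prerequisites and the size rule described above, the following sketch checks whether the external programs can be found on the PATH and classifies a cluster by counting its fasta headers. The helper names (`count_seqs`, `classify_cluster`), the executable names passed to `Sys.which()`, and the treatment of a cluster with exactly 1,000 sequences as "large" are assumptions made for illustration; they are not taken from this package or from the Y&S scripts.

```r
# Hypothetical helper: count sequences in a fasta file by counting header lines.
count_seqs <- function(fasta_file) {
  sum(startsWith(readLines(fasta_file, warn = FALSE), ">"))
}

# Hypothetical helper: apply the 1,000-sequence rule from the description.
# Assumption: a cluster with exactly 1,000 sequences is treated as "large".
classify_cluster <- function(n_seqs, cutoff = 1000) {
  ifelse(n_seqs < cutoff, "small", "large")
}

# Check which external programs are visible on the PATH.
# Assumption: the executable names may differ between installations.
programs <- c("mafft", "pasta", "phyutility", "raxml", "fasttree")
found <- Sys.which(programs) != ""  # named logical vector, one entry per program

# Small demonstration with a temporary two-sequence cluster.
tmp <- file.path(tempdir(), "cluster1.fa")
writeLines(c(">seq1", "ACGT", ">seq2", "ACGA"), tmp)
n <- count_seqs(tmp)
size_class <- classify_cluster(n)
```

Inspecting `found` before running the pipeline may help catch a missing program early, since the wrapped script relies on all of them.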
path_to_ys |
Character vector of length one; the path to the folder containing Y&S python scripts, e.g., |
seq_folder |
Character vector of length one; the path to the folder containing the fasta files. |
number_cores |
Numeric; number of threads to use for |
seq_type |
Character vector of length one indicating type of sequences. Should either be |
bootstrap |
Logical; should a bootstrap analysis be run for the trees? |
overwrite |
Logical; should previous output of this command be erased so new output can be written? Once erased it cannot be restored, so use with caution! |
get_hash |
Logical; should the MD5 hash (32 hexadecimal characters) be computed for all output tree files concatenated together? Used for by |
echo |
Logical; should the standard output and error be printed to the screen? |
... |
Other arguments. Not used by this function, but meant to be used by |
Wrapper for Yang and Smith (2014) fasta_to_tree.py
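To make the wrapping idea concrete, here is a minimal sketch of how an R function might assemble the command-line arguments for the Python script. The helper name `build_ys_args` is hypothetical, and the argument order (fasta folder, number of cores, sequence type, then a "y"/"n" bootstrap flag) is an assumption about the script's interface that should be checked against the Y&S documentation; this is not the package's actual implementation.

```r
# Hypothetical helper: assemble the arguments for a call to fasta_to_tree.py.
# Assumption: the script takes, in order, the fasta folder, the number of
# cores, the sequence type, and a "y"/"n" bootstrap flag.
build_ys_args <- function(path_to_ys, seq_folder, number_cores, seq_type,
                          bootstrap = FALSE) {
  c(
    file.path(path_to_ys, "fasta_to_tree.py"),  # full path to the script
    seq_folder,
    as.character(number_cores),
    seq_type,
    if (bootstrap) "y" else "n"                 # logical converted to a flag
  )
}

# Placeholder paths, for illustration only.
ys_args <- build_ys_args("path/to/ys/scripts", "some/folder", 1, "dna")
```

The resulting character vector could then be passed to something like `system2("python", ys_args)`; building the arguments separately from running them keeps the command easy to inspect.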
For each input cluster cluster1.fa in seq_folder, an alignment cluster1.fa.mafft.aln (small clusters) or cluster1.pasta.aln (large clusters), a cleaned alignment cluster1.fa.mafft.aln-cln (small clusters) or cluster1.fa.pasta.aln-cln (large clusters), and a tree cluster1.raxml.tre (small clusters) or cluster1.fasttree.tre (large clusters) will be written to seq_folder. If get_hash is TRUE, the MD5 hash (32 hexadecimal characters) computed for all .tre files concatenated together will be returned.
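One way such a hash could be computed is sketched below using `tools::md5sum()`, which ships with R. The helper name `hash_trees` and the choice to concatenate files in sorted file-name order are assumptions for illustration; the package may compute the hash differently, so values from this sketch are not guaranteed to match the function's return value.

```r
# Hypothetical helper: MD5 hash of all .tre files in a folder, concatenated.
# Assumption: files are concatenated in sorted file-name order.
hash_trees <- function(seq_folder) {
  tre_files <- sort(list.files(seq_folder, pattern = "\\.tre$",
                               full.names = TRUE))
  combined <- tempfile(fileext = ".tre")  # lives in tempdir(), not seq_folder
  on.exit(unlink(combined))
  file.create(combined)
  for (f in tre_files) file.append(combined, f)
  unname(tools::md5sum(combined))         # 32-character hexadecimal string
}

# Demonstration with two toy Newick trees in a temporary folder.
demo_dir <- file.path(tempdir(), "tree_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("(a,(b,c));", file.path(demo_dir, "cluster1.raxml.tre"))
writeLines("((a,b),c);", file.path(demo_dir, "cluster2.raxml.tre"))
tree_hash <- hash_trees(demo_dir)
```

Because the hash depends only on file contents, rerunning on unchanged trees gives the same string, which is what makes it useful for detecting whether output has changed.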
Joel H Nitta, joelnitta@gmail.com
Yang, Y. and S.A. Smith. 2014. Orthology inference in non-model organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics. Molecular Biology and Evolution 31:3081-3092. https://bitbucket.org/yangya/phylogenomic_dataset_construction/overview
## Not run:
fasta_to_tree(seq_folder = "some/folder/containing/fasta/seqs", number_cores = 1, seq_type = "dna", bootstrap = FALSE)