mask_tips_by_taxonID_transcripts: Mask tips in tree.


Description

Given a folder containing phylogenetic trees and their alignments, mask monophyletic (and optionally, paraphyletic) tips belonging to the same taxon (i.e., keep only a single tip to represent clades consisting of a single taxon). Tree files are assumed to end in .tt (the output of trim_tips), and only tree files with this ending will be included. Alignment files are assumed to end in .aln-cln (the output of fasta_to_tree), and only alignment files with this ending will be included. The tip with the fewest ambiguous characters in the alignment will be kept. This function will overwrite any output files with the same name in tree_folder.
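
The rule for choosing which tip to keep can be illustrated with a short base R sketch. This is not the code used by the underlying Python script: the helper name pick_best_tip, the tip names, and the set of characters treated as ambiguous ("-", "?", "N", "X") are all assumptions made for illustration.

```r
# Hypothetical helper: among tips from the same taxon, return the name of the
# tip whose aligned sequence has the fewest ambiguous characters.
# ASSUMPTION: "-", "?", "N", and "X" count as ambiguous; the real script
# may use a different set.
pick_best_tip <- function(seqs, ambiguous = c("-", "?", "N", "X")) {
  n_ambig <- vapply(
    seqs,
    function(s) sum(strsplit(toupper(s), "")[[1]] %in% ambiguous),
    integer(1)
  )
  # which.min() returns the first minimum, so ties go to the earliest tip
  names(seqs)[which.min(n_ambig)]
}

# Three made-up aligned sequences for tips of a single taxon
seqs <- c(
  "taxonA@seq1" = "ATG--CNNTA",
  "taxonA@seq2" = "ATGCCCGATA",
  "taxonA@seq3" = "ATG?CCG-TA"
)
best <- pick_best_tip(seqs)
```

Here best holds the name of the tip that would be kept; the other two tips would be masked.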

Usage

mask_tips_by_taxonID_transcripts(
  path_to_ys = pkgconfig::get_config("baitfindR::path_to_ys"),
  tree_folder,
  aln_folder,
  mask_paraphyletic = TRUE,
  overwrite = FALSE,
  get_hash = TRUE,
  echo = pkgconfig::get_config("baitfindR::echo", fallback = FALSE),
  ...
)

Arguments

path_to_ys

Character vector of length one; the path to the folder containing Y&S python scripts, e.g., "/Users/me/apps/phylogenomic_dataset_construction/"

tree_folder

Character vector of length one; the path to the folder containing the trees to mask.

aln_folder

Character vector of length one; the path to the folder containing the alignments used to make the trees.

mask_paraphyletic

Logical; should paraphyletic tips belonging to the same taxon be masked?

overwrite

Logical; should previous output of this command be erased so new output can be written? Once erased it cannot be restored, so use with caution!

get_hash

Logical; should the 32-character MD5 hash be computed for all output masked tree files concatenated together? Used by drake_plan for tracking during workflows. If TRUE, this function will return the hash.

echo

Logical; should the standard output and error be printed to the screen?

...

Other arguments. Not used by this function, but meant to be used by drake_plan for tracking during workflows.
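
Because only files with the expected endings are read, it can help to check what tree_folder and aln_folder actually contain before calling the function. The following is a minimal base R sketch; the temporary folders and file names are made up for demonstration and are not part of baitfindR.

```r
# Hypothetical pre-flight check: list the files that match the expected
# endings. The folders and files below are created in a temporary location
# purely for demonstration.
tree_folder <- tempfile("trees")
aln_folder <- tempfile("alns")
dir.create(tree_folder)
dir.create(aln_folder)
invisible(file.create(
  file.path(tree_folder, c("cluster1.tt", "cluster2.tt", "notes.txt"))
))
invisible(file.create(
  file.path(aln_folder, c("cluster1.aln-cln", "cluster2.fa"))
))

# Only names ending in ".tt" or ".aln-cln" are matched
tree_files <- list.files(tree_folder, pattern = "\\.tt$")
aln_files <- list.files(aln_folder, pattern = "\\.aln-cln$")
```

In this sketch, notes.txt and cluster2.fa are not matched, mirroring how files with other endings are excluded.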

Details

Wrapper for the Yang and Smith (2014) script mask_tips_by_taxonID_transcripts.py.

Value

For each input tree with a file ending in .tt in tree_folder, a masked tree with a file ending in .mm will be written to tree_folder. If get_hash is TRUE, the 32-character MD5 hash computed for all masked tree files concatenated together will be returned.
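
As a rough illustration of what such a hash represents, the sketch below concatenates all .mm files in a folder and computes a single MD5 hash with tools::md5sum. The helper name hash_masked_trees and the alphabetical concatenation order are assumptions; the function itself may combine files differently, so hashes from this sketch need not match its return value.

```r
# Hypothetical helper: MD5 hash of all ".mm" files in a folder,
# concatenated in alphabetical order.
# ASSUMPTION: the ordering and concatenation details of the real function
# may differ.
hash_masked_trees <- function(tree_folder) {
  files <- sort(list.files(tree_folder, pattern = "\\.mm$", full.names = TRUE))
  combined <- tempfile()
  on.exit(unlink(combined))
  file.create(combined)
  # Append each masked tree file to the combined temporary file
  for (f in files) file.append(combined, f)
  unname(tools::md5sum(combined))
}

# Demonstration with two made-up masked trees in a temporary folder
demo_folder <- tempfile("masked")
dir.create(demo_folder)
writeLines("(A,(B,C));", file.path(demo_folder, "cluster1.mm"))
writeLines("((A,B),C);", file.path(demo_folder, "cluster2.mm"))
hash <- hash_masked_trees(demo_folder)
```

The result is a single 32-character hexadecimal string that changes whenever the content of any masked tree file changes, which is what makes it useful for tracking outputs in a workflow.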

Author(s)

Joel H Nitta, joelnitta@gmail.com

References

Yang, Y. and S.A. Smith. 2014. Orthology inference in non-model organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics. Molecular Biology and Evolution 31:3081-3092. https://bitbucket.org/yangya/phylogenomic_dataset_construction/overview

Examples

## Not run:
mask_tips_by_taxonID_transcripts(
  tree_folder = "some/folder/containing/tree/files",
  aln_folder = "some/folder/containing/alignment/files"
)
## End(Not run)

joelnitta/baitfindR documentation built on May 7, 2020, 6:21 p.m.