prune_paralogs_RT: Identify orthologs using the "RT" method.


Description

Given a folder containing homolog trees, prune paralogs from the trees using the rooted ingroups (RT) method. For trees that have outgroups, this method iteratively extracts subtrees with the highest number of ingroup taxa/samples. This function will overwrite any output files with the same name in outdir.

Usage

prune_paralogs_RT(
  path_to_ys = pkgconfig::get_config("baitfindR::path_to_ys"),
  tree_folder,
  tree_file_ending,
  ingroup,
  outgroup,
  min_ingroup_taxa = 2,
  outdir,
  overwrite = FALSE,
  get_hash = TRUE,
  echo = pkgconfig::get_config("baitfindR::echo", fallback = FALSE),
  ...
)

Arguments

path_to_ys

Character vector of length one; the path to the folder containing Y&S python scripts, e.g., "/Users/me/apps/phylogenomic_dataset_construction/"

tree_folder

Character vector of length one; the path to the folder containing the trees to be used for pruning.

tree_file_ending

Character vector of length one; only tree files with this file ending will be used.

ingroup

Character vector; names of ingroup taxa/samples.

outgroup

Character vector; names of outgroup taxa/samples.

min_ingroup_taxa

Numeric; minimum number of ingroup taxa required for an ortholog to be written. Default 2.

outdir

Character vector of length one; the path to the folder where the pruned trees should be written.

overwrite

Logical; should previous output of this command be erased so new output can be written? Once erased it cannot be restored, so use with caution!

get_hash

Logical; should the 32-character MD5 hash be computed for all pruned tree files concatenated together? Used by drake_plan for tracking during workflows. If TRUE, this function will return the hash.

echo

Logical; should the standard output and error be printed to the screen?

...

Other arguments. Not used by this function, but meant to be used by drake_plan for tracking during workflows.
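As a rough illustration of how tree_folder and tree_file_ending work together, the sketch below selects files by their ending using base R only. It is an assumption about the selection logic, not the package's actual implementation, and the helper name list_tree_files is hypothetical.

```r
# Hypothetical helper (not part of baitfindR): list the files in a folder
# whose names end with a given string, matched literally (no regex).
list_tree_files <- function(tree_folder, tree_file_ending) {
  all_files <- list.files(tree_folder)
  all_files[endsWith(all_files, tree_file_ending)]
}

# Demonstration with throwaway files in a fresh temporary folder
demo_folder <- tempfile("tree_demo")
dir.create(demo_folder)
file.create(file.path(demo_folder, c("cluster1.tre", "cluster2.tre", "notes.txt")))

tree_files <- list_tree_files(demo_folder, ".tre")
print(tree_files)
```

Matching with endsWith() rather than a regular expression avoids having to escape the dot in endings such as ".tre".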

Details

Wrapper for prune_paralogs_RT.py from Yang and Smith (2014).

Value

For each tree file ending in tree_file_ending in tree_folder, the following outputs are possible depending on the presence of outgroups in the homolog tree:

No outgroups in homolog tree

Unrooted ingroup clades without duplications (files ending in unrooted-ortho.tre)

Outgroups present in homolog tree

Rooted ingroup clades (files ending in inclade) and one or more paralogs (files ending in inclade.ortho.tre)

If get_hash is TRUE, the 32-character MD5 hash computed from all extracted tree files concatenated together is returned.
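To make the returned hash concrete, here is a minimal sketch of hashing several tree files as one concatenated unit with base R and tools::md5sum(). The helper hash_concatenated_files is hypothetical, and the file ordering and concatenation details are assumptions; the package may compute its hash differently.

```r
# Hypothetical helper (not part of baitfindR): MD5 hash of the contents
# of several files, concatenated in sorted order.
hash_concatenated_files <- function(files) {
  files <- sort(files)
  combined <- tempfile()
  on.exit(unlink(combined))
  # Append each file's lines to a single temporary file
  for (f in files) {
    cat(readLines(f, warn = FALSE), file = combined, sep = "\n", append = TRUE)
  }
  # md5sum() returns a named vector; drop the file name
  unname(tools::md5sum(combined))
}

# Demonstration with two small made-up tree files
demo_dir <- tempfile("hash_demo")
dir.create(demo_dir)
writeLines("(A,B);", file.path(demo_dir, "one.inclade.ortho.tre"))
writeLines("(C,D);", file.path(demo_dir, "two.inclade.ortho.tre"))

tree_hash <- hash_concatenated_files(list.files(demo_dir, full.names = TRUE))
nchar(tree_hash)  # an MD5 digest is 32 hexadecimal characters
```

Because the hash depends only on file contents, a workflow manager can compare it between runs to decide whether downstream steps need to be rerun.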

Author(s)

Joel H Nitta, joelnitta@gmail.com

References

Yang, Y. and S.A. Smith. 2014. Orthology inference in non-model organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics. Molecular Biology and Evolution 31:3081-3092. https://bitbucket.org/yangya/phylogenomic_dataset_construction/overview

Examples

## Not run:
prune_paralogs_RT(
  tree_folder = "some/folder/containing/tree/files",
  tree_file_ending = ".tre",
  outgroup = c("ABC", "EFG"),
  ingroup = c("HIJ", "KLM"),
  outdir = "some/folder"
)
## End(Not run)

joelnitta/baitfindR documentation built on May 7, 2020, 6:21 p.m.