make_bold_otol_tree: Use genetic data from the Barcode of Life Database (BOLD) to...

View source: R/bold_tree.R

make_bold_otol_tree    R Documentation

Use genetic data from the Barcode of Life Database (BOLD) to reconstruct branch lengths on a tree.

Description

make_bold_otol_tree takes taxon names from a tree topology or a vector of names, searches the Barcode of Life Database (BOLD) for genetic markers, creates an alignment, and reconstructs branch lengths on a tree topology with Maximum Likelihood.

Usage

make_bold_otol_tree(
  input = c("Rhea americana", "Struthio camelus", "Gallus gallus"),
  marker = "COI",
  otol_version = "v3",
  chronogram = TRUE,
  doML = FALSE,
  aligner = "muscle",
  ...
)

Arguments

input

One of the following:

A character vector

With taxon names as a single comma-separated string or concatenated with c().

A phylogenetic tree with taxon names as tip labels

As a phylo or multiPhylo object, OR as a newick character string.

A datelifeQuery object

An output from make_datelife_query().
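The accepted input forms can be sketched as follows. This is a minimal illustration, not taken from the package examples: the newick string is a hand-written, hypothetical topology, and the use of underscores in its tip labels is an assumption. The call itself needs the datelife package and network access to BOLD and Open Tree of Life, so it is guarded to run only in an interactive session.

```r
# Three ways to express the same three taxa as `input`.
inputs <- list(
  # Character vector concatenated with c()
  vector = c("Rhea americana", "Struthio camelus", "Gallus gallus"),
  # Single comma-separated string
  string = "Rhea americana, Struthio camelus, Gallus gallus",
  # Newick string (hypothetical topology, written by hand for illustration)
  newick = "((Rhea_americana,Struthio_camelus),Gallus_gallus);"
)

# Only query BOLD and Open Tree of Life when run interactively with
# datelife installed; arguments follow the Usage section above.
if (interactive() && requireNamespace("datelife", quietly = TRUE)) {
  tree <- datelife::make_bold_otol_tree(input = inputs$vector, marker = "COI")
  print(tree)
}
```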

marker

A character vector indicating the gene from the BOLD system to be used for branch length estimation.

otol_version

Version of Open Tree of Life to use.

chronogram

Defaults to TRUE, in which case the returned branch lengths are estimated with ape::chronoMPL(). If FALSE, the returned branch lengths are estimated with phangorn::acctran() and represent relative substitution rates.

doML

Defaults to FALSE. If TRUE, it performs an ML branch length optimization with phangorn::optim.pml().

aligner

A character vector indicating whether to use MAFFT or MUSCLE to align BOLD sequences. It is not case sensitive. Defaults to MUSCLE, which is supported through the msa package from Bioconductor; msa needs to be installed using BiocManager::install().
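Because the default aligner depends on a Bioconductor package, it can help to check for it before calling the function. The sketch below only reports whether msa is available and prints one possible way to install it. It does not install anything itself, and the variable name msa_available is illustrative.

```r
# Check whether the Bioconductor msa package can be loaded.
msa_available <- requireNamespace("msa", quietly = TRUE)

if (!msa_available) {
  # Print install instructions instead of installing automatically.
  message(
    "The msa package is not installed. One way to install it:\n",
    "  install.packages(\"BiocManager\")\n",
    "  BiocManager::install(\"msa\")"
  )
}
```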

...

Arguments passed on to get_otol_synthetic_tree

resolve

Defaults to TRUE. Whether to resolve the tree at random or not.

ott_ids

If not NULL, it takes this argument and ignores input. A numeric vector of ott ids obtained with rotl::taxonomy_taxon_info() or rotl::tnrs_match_names() or tnrs_match().

Details

If input is a phylo object or a newick string, it is used as backbone topology. If input is a character vector of taxon names, an induced synthetic OpenTree subtree is used as backbone.

Value

A phylo object. If enough BOLD sequences are available for the input taxon names, the function returns a tree with branch lengths proportional to relative substitution rate. Otherwise, it returns the topology given as input, or a synthetic Open Tree of Life tree for the taxon names given in input, obtained with get_otol_synthetic_tree().
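Since the function may return a tree with or without reconstructed branch lengths, a caller may want to inspect the result. The helper below, has_branch_lengths(), is a hypothetical name and not part of datelife. It relies only on the ape convention that a phylo object stores branch lengths in its edge.length element. It reports whether branch lengths are present, not where they came from: an input topology that already had branch lengths would also pass this check.

```r
# Hypothetical helper: TRUE when a phylo object carries branch lengths.
has_branch_lengths <- function(tree) {
  inherits(tree, "phylo") && !is.null(tree$edge.length)
}

# Hand-built phylo object for illustration: 3 tips, 2 internal nodes,
# no branch lengths. Edges: 4-5, 5-1, 5-2, 4-3.
topology_only <- structure(
  list(
    edge = matrix(c(4L, 5L, 5L, 4L, 5L, 1L, 2L, 3L), ncol = 2),
    tip.label = c("Rhea_americana", "Struthio_camelus", "Gallus_gallus"),
    Nnode = 2L
  ),
  class = "phylo"
)

# Same topology with made-up branch lengths, one per edge.
with_lengths <- topology_only
with_lengths$edge.length <- c(1, 1, 1, 2)

has_branch_lengths(topology_only)  # FALSE
has_branch_lengths(with_lengths)   # TRUE
```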


phylotastic/datelife documentation built on Jan. 17, 2024, 11:10 p.m.