datelife_search: Core function to input a vector of species, newick string, or...


View source: R/datelife.R

Description

Core function to input a vector of species, newick string, or phylo object to get a chronogram or dates back.

Usage

datelife_search(input = c("Rhea americana", "Pterocnemia pennata",
  "Struthio camelus"), summary_format = "phylo_all", partial = TRUE,
  use_tnrs = FALSE, approximate_match = TRUE, update_cache = FALSE,
  cache = get("opentree_chronograms"), dating_method = "PATHd8",
  summary_print = c("citations", "taxa"),
  add_taxon_distribution = c("none", "summary", "matrix"),
  get_spp_from_taxon = FALSE, verbose = FALSE, criterion = "taxa")

Arguments

input

Target taxa names as a character vector, a newick character string, or a phylo object.

summary_format

The desired output format for target chronograms (chronograms of target taxa). See details.

partial

If TRUE, use source chronograms even if they match only some of the desired taxa.

use_tnrs

If TRUE, use OpenTree's services to resolve names. This can dramatically improve the chance of matches, but can also take much longer.

approximate_match

If TRUE, use a slower TNRS to correct misspellings, increasing the chance of matches (including false matches).

update_cache

Defaults to FALSE.

cache

The cached set of chronograms and other info from data(opentree_chronograms).

dating_method

The method used for tree dating.

summary_print

A character vector specifying the type of summary information to be printed: "citations" for the references of the cached chronograms in which target taxa are found, "taxa" for a summary of the number of chronograms in which each target taxon is found, or "none" if nothing should be printed. Defaults to printing both, c("citations", "taxa").

add_taxon_distribution

A character vector specifying whether data on target taxa missing from source chronograms should be added to the output as a "summary" or as a presence/absence "matrix". Defaults to "none", in which case no taxon distribution information is added to the output.

get_spp_from_taxon

Boolean vector, defaults to FALSE. If TRUE, all species names are retrieved for the taxon names given in input. Must have the same length as input. If input is a newick string with some clades, it will be converted to a phylo object phy, and the order of get_spp_from_taxon will match phy$tip.label.

verbose

Boolean. If TRUE, progress updates are printed for the user.

criterion

Whether to get the grove with the most trees or the most taxa. Defaults to "taxa".
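As a minimal sketch of how the arguments above fit together, the snippet below builds a get_spp_from_taxon vector with one flag per element of input and passes it along with a few non-default arguments. The genus name "Struthio" and the variable names expand_to_species and search_args are illustrative assumptions, not part of the package documentation. The search itself runs only in an interactive session with datelife installed, since it can be slow.

```r
# Target taxa: two species and one genus (the genus is the one to expand).
input <- c("Rhea americana", "Struthio", "Mus musculus")

# One logical flag per element of `input`, in the same order.
expand_to_species <- input == "Struthio"
stopifnot(length(expand_to_species) == length(input))

# Collect the non-default arguments in a list so they can be inspected first.
search_args <- list(
  input = input,
  summary_format = "phylo_all",
  get_spp_from_taxon = expand_to_species,
  summary_print = "none"
)

# Run the search only interactively and only when datelife is installed.
if (interactive() && requireNamespace("datelife", quietly = TRUE)) {
  chronograms <- do.call(datelife::datelife_search, search_args)
}
```

Building the flags from a comparison on input keeps the two vectors aligned even if the list of taxa is later reordered.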

Details

Available output formats are:

citations: A character vector of references where chronograms with some or all of the target taxa are published (source chronograms).

mrca: A named numeric vector of most recent common ancestor (mrca) ages of target taxa defined in input, obtained from the source chronograms. Names of mrca vector are equal to citations.

newick_all: A named character vector of newick strings corresponding to target chronograms derived from source chronograms. Names of newick_all vector are equal to citations.

newick_sdm: Only if multiple source chronograms are available. A character vector with a single newick string corresponding to a target chronogram obtained with SDM supertree method (Criscuolo et al. 2006).

newick_median: Only if multiple source chronograms are available. A character vector with a single newick string corresponding to a target chronogram from the median of all source chronograms.

phylo_sdm: Only if multiple source chronograms are available. A phylo object with a single target chronogram obtained with SDM supertree method (Criscuolo et al. 2006).

phylo_median: Only if multiple source chronograms are available. A phylo object with a single target chronogram obtained from source chronograms with median method.

phylo_all: A named list of phylo objects corresponding to each target chronogram obtained from available source chronograms. Names of phylo_all list correspond to citations.

phylo_biggest: The chronogram with the most taxa. In the case of a tie, the chronogram with clade age closest to the median age of the equally large trees is returned.

html: A character vector with an html string that can be saved and then opened in any web browser. It contains a four-column table with data on target taxa: mrca, number of taxa, citations of source chronograms and newick target chronogram.

data_frame: A data frame with data on target taxa: mrca, number of taxa, citations of source chronograms and newick string.
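The tie-breaking rule described for phylo_biggest can be illustrated with a small base R sketch. It is not the package's implementation: the data frame, its column names and the function pick_biggest are hypothetical, the ages are made up, and "clade age" is assumed here to mean the mrca age of each chronogram.

```r
# Hypothetical summary of four target chronograms (values are made up).
chronograms <- data.frame(
  citation = c("Study A", "Study B", "Study C", "Study D"),
  n_taxa = c(4, 3, 4, 4),
  mrca_age = c(100, 95, 112, 120),
  stringsAsFactors = FALSE
)

pick_biggest <- function(x) {
  # Keep only the chronograms tied for the largest number of taxa.
  biggest <- x[x$n_taxa == max(x$n_taxa), , drop = FALSE]
  # Among those, take the one whose age is closest to their median age.
  distance_to_median <- abs(biggest$mrca_age - median(biggest$mrca_age))
  biggest[which.min(distance_to_median), , drop = FALSE]
}

pick_biggest(chronograms)$citation
# → "Study C"
```

Three chronograms tie at four taxa, their median age is 112, and "Study C" sits exactly on it, so it is the one selected.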

For formats that return a single synthetic tree, the source trees leading to it must form a grove (roughly, a set of trees whose taxa overlap sufficiently; see Ané et al. 2009, doi:10.1007/s00026-009-0017-x). In the rare case of multiple groves, the criterion argument determines whether the grove with the most trees or the one with the most taxa is used.
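To make the choice between "most trees" and "most taxa" concrete, here is a rough base R sketch. It deliberately replaces the formal grove test with a much simpler assumption: two trees are linked when they share at least min_shared tip labels, and linked trees are collected into groups. The functions group_trees and pick_group, the tip labels, and the value "trees" for the criterion are all hypothetical; only "taxa" appears as a value in the usage above.

```r
# Hypothetical tip labels of five source chronograms.
tree_taxa <- list(
  t1 = c("A", "B", "C"),
  t2 = c("B", "C", "D"),
  t3 = c("C", "D", "E"),
  t4 = c("X", "Y", "Z", "W", "V", "U"),
  t5 = c("Y", "Z", "W", "V", "U", "T")
)

# Group trees into connected sets, linking two trees when they share
# at least `min_shared` taxa (a simplification, not the formal grove test).
group_trees <- function(taxa, min_shared = 2) {
  group <- seq_along(taxa)
  for (i in seq_along(taxa)) {
    for (j in seq_along(taxa)) {
      if (i < j && length(intersect(taxa[[i]], taxa[[j]])) >= min_shared) {
        # Merge the group of tree j into the group of tree i.
        group[group == group[j]] <- group[i]
      }
    }
  }
  split(names(taxa), group)
}

# Choose the group with the most trees or the most distinct taxa.
pick_group <- function(taxa, criterion = c("taxa", "trees")) {
  criterion <- match.arg(criterion)
  groups <- group_trees(taxa)
  score <- vapply(groups, function(g) {
    if (criterion == "trees") length(g) else length(unique(unlist(taxa[g])))
  }, integer(1))
  groups[[which.max(score)]]
}

pick_group(tree_taxa, "trees")  # → "t1" "t2" "t3"
pick_group(tree_taxa, "taxa")   # → "t4" "t5"
```

The first group holds three trees covering five taxa, the second holds two trees covering seven, so the two criteria pick different groups.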

Examples

# obtain median ages from a set of source chronograms in newick format:
ages <- datelife_search(c("Rhea americana", "Pterocnemia pennata", "Struthio camelus",
		"Mus musculus"), summary_format="newick_median")
# save the tree in newick format
write(ages, file="some.bird.ages.txt")

# obtain median ages from a set of source chronograms in phylo format
# will produce the same tree as above, but as an R phylo object:
ages.again <- datelife_search(c("Rhea americana", "Pterocnemia pennata", "Struthio camelus",
		"Mus musculus"), summary_format="phylo_median")
library(ape) # load ape first: it provides the phylo plot method, axisPhylo and write.tree
plot(ages.again)
ape::axisPhylo()
# center the axis label under the plotted tree
mtext("Time (million years ago)", side = 1, line = 2, at = (max(get("last_plot.phylo",
		envir = .PlotPhyloEnv)$xx) * 0.5))
write.tree(ages.again, file="some.bird.tree.again.txt") # saves phylo object in newick format

# obtain mrca ages and target chronograms from all source chronograms
# generate an html output readable in any web browser:
ages.html <- datelife_search(c("Rhea americana", "Pterocnemia pennata", "Struthio camelus",
		"Mus musculus"), summary_format="html")
write(ages.html, file="some.bird.trees.html")
# "open" is a macOS command; on other systems, open the file from a browser instead
system("open some.bird.trees.html")

phylotastic/datelife documentation built on Jan. 22, 2019, 12:29 a.m.