| run_deepSTRAPP_for_focal_time | R Documentation |
Wrapper function to run the deepSTRAPP workflow for a given point in the past (i.e. the focal_time).
It starts from traits mapped on a phylogeny (trait data) and BAMM output (diversification data)
and carries out the appropriate statistical method to test for a relationship between diversification rates and trait data.
Tests are based on block permutations: rate data are randomized across tips following blocks
defined by the diversification regimes identified on each tip (typically from a BAMM analysis).
Such tests are called STructured RAte Permutations on Phylogenies (STRAPP), as described in Rabosky, D. L., & Huang, H. (2016). A robust semi-parametric test for detecting trait-dependent diversification. Systematic Biology, 65(2), 181-193. doi:10.1093/sysbio/syv066.
See the original BAMMtools::traitDependentBAMM() function used to
carry out STRAPP tests on extant time-calibrated phylogenies.
Tests can be carried out on speciation, extinction and net diversification rates.
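To make the block-permutation idea concrete, here is a minimal base-R sketch for a continuous trait. It is not the package's implementation: the toy data, the regime_id and regime_rates objects, and the helper strapp_sketch() are all hypothetical, and the regime configuration is held fixed across posterior samples for simplicity.

```r
set.seed(42)

# Toy data (all hypothetical): 12 tips grouped into 4 diversification regimes
n_tips    <- 12
regime_id <- rep(1:4, each = 3)   # regime (block) of each tip
trait     <- rnorm(n_tips)        # continuous trait value per tip

# One row per posterior sample, one column per regime.
# Simplification: the regime configuration is the same in every sample.
n_samples    <- 200
regime_rates <- matrix(rexp(n_samples * 4, rate = 5),
                       nrow = n_samples, ncol = 4)

strapp_sketch <- function(trait, regime_id, regime_rates) {
  obs_stats  <- numeric(nrow(regime_rates))
  null_stats <- numeric(nrow(regime_rates))
  for (i in seq_len(nrow(regime_rates))) {
    # Observed tip rates: each tip inherits the rate of its regime
    obs_rates <- regime_rates[i, regime_id]
    # Block permutation: shuffle rates among regimes, so that tips sharing
    # a regime still share the same (permuted) rate
    perm_rates <- sample(regime_rates[i, ])[regime_id]
    obs_stats[i]  <- cor(trait, obs_rates,  method = "spearman")
    null_stats[i] <- cor(trait, perm_rates, method = "spearman")
  }
  # Two-tailed p-value: share of samples in which the observed statistic
  # is no more extreme than the permuted one
  p_value <- mean(abs(obs_stats) <= abs(null_stats))
  list(obs_stats = obs_stats, null_stats = null_stats, p_value = p_value)
}

res <- strapp_sketch(trait, regime_id, regime_rates)
res$p_value
```

Permuting whole regimes rather than individual tips is what keeps tips that share a regime from being treated as independent data points.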
run_deepSTRAPP_for_focal_time(
contMap = NULL,
densityMaps = NULL,
ace = NULL,
tip_data = NULL,
trait_data_type,
BAMM_object,
focal_time,
keep_tip_labels = TRUE,
rate_type = "net_diversification",
seed = NULL,
nb_permutations = NULL,
replace_samples = FALSE,
alpha = 0.05,
two_tailed = TRUE,
one_tailed_hypothesis = NULL,
posthoc_pairwise_tests = FALSE,
p.adjust_method = "none",
return_perm_data = FALSE,
nthreads = 1,
print_hypothesis = TRUE,
extract_diversification_data_melted_df = FALSE,
return_updated_trait_data_with_Map = FALSE,
return_updated_BAMM_object = FALSE,
verbose = TRUE
)
contMap |
For continuous trait data. Object of class |
densityMaps |
For categorical trait or biogeographic data. List of objects of class |
ace |
(Optional) Ancestral Character Estimates (ACE) at the internal nodes.
Obtained with
|
tip_data |
(Optional) Named vector of tip values of the trait.
|
trait_data_type |
Character string. Specify the type of trait data. Must be one of "continuous", "categorical", "biogeographic". |
BAMM_object |
Object of class |
focal_time |
Numerical. The time, in terms of time distance from the present, at which data must be extracted and the phylogeny and mappings must be cut. It must be smaller than the root age of the phylogeny. |
keep_tip_labels |
Logical. Specify whether terminal branches with a single descendant tip
must retain their initial |
rate_type |
A character string specifying the type of diversification rates to use. Must be one of 'speciation', 'extinction' or 'net_diversification' (default). |
seed |
Integer. Set the seed to ensure reproducibility. Default is |
nb_permutations |
Integer. To select the number of random permutations to perform during the tests. If NULL (default), all posterior samples will be used once. |
replace_samples |
Logical. To specify whether to allow 'replacement' (i.e., multiple use) of a posterior sample
when drawing samples used to carry out the STRAPP test. Default is |
alpha |
Numerical. Significance level to use to compute the |
two_tailed |
Logical. To define the type of tests. If
|
one_tailed_hypothesis |
A character string specifying the alternative hypothesis in the one-tailed test. For continuous data, it is either "negative" or "positive" correlation. For binary data, it lists the trait states ordered by increasing rates under the alternative hypothesis, separated by a greater-than sign, such as c('A > B'). |
posthoc_pairwise_tests |
Logical. Only for multinomial data (with more than two states). If |
p.adjust_method |
A character string. Only for multinomial data (with more than two states). It specifies the type of correction to apply to the p-values
in the post hoc pairwise tests to account for multiple comparisons. See |
return_perm_data |
Logical. Whether to return the stats data computed from the posterior samples for observed and permuted data in the output.
This is needed to plot the histogram of the null distribution used to assess significance of the test with |
nthreads |
Integer. Number of threads to use for parallel computing of the STRAPP tests across the permutations.
The R package |
print_hypothesis |
Logical. Whether to print information on what test is carried out, detailing the null and alternative hypotheses,
and what significance level is used to reject or not the null hypothesis. Default is |
extract_diversification_data_melted_df |
Logical. Specify whether diversification data (regimes ID and tip rates) must be extracted from the |
return_updated_trait_data_with_Map |
Logical. Specify whether the |
return_updated_BAMM_object |
Logical. Specify whether the |
verbose |
Logical. Should progress be displayed? A message will be printed at each step of the deepSTRAPP workflow,
and for every batch of 100 BAMM posterior samples whose rates and regimes are updated, and optionally extracted in a melted data.frame
(if |
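The p.adjust_method argument above is described as a correction for multiple comparisons in the post hoc pairwise tests. Assuming its accepted values follow those of base R's stats::p.adjust() (an assumption, since the reference in the argument description is truncated), the effect of such corrections can be sketched on made-up p-values:

```r
# Hypothetical p-values from three pairwise comparisons (not real results)
raw_p <- c("A vs B" = 0.010, "A vs C" = 0.040, "B vs C" = 0.300)

# Compare no correction ("none", the default listed in the usage above)
# with two common corrections available in stats::p.adjust()
adjusted <- sapply(c("none", "bonferroni", "holm"),
                   function(m) stats::p.adjust(raw_p, method = m))
adjusted
```

Each column holds the p-values under one method, so the cost of a stricter correction can be read across a row.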
The function encapsulates several functions carrying out each step of the deepSTRAPP workflow:
extract_most_likely_trait_values_for_focal_time() extracts the most likely trait values
found along branches at the focal_time.
Optionally, the function can update the mapped phylogeny (contMap/densityMaps) such that
branches overlapping the focal_time are shortened to the focal_time, and
the trait mapping for the cut-off branches is removed
by updating the $tree$maps and $tree$mapped.edge elements.
update_rates_and_regimes_for_focal_time() updates the BAMM_object to obtain
the diversification rates/regimes found along branches at the focal_time.
Optionally, the function can update the BAMM_object to display a mapped phylogeny
such that branches overlapping the focal_time are shortened to the focal_time.
If requested (extract_diversification_data_melted_df = TRUE), extract_diversification_data_melted_df_for_focal_time()
will be used to extract regimes ID and tip rates from the updated_BAMM_object and provide a melted data.frame summarizing the diversification data
as found on the phylogeny for the focal_time.
compute_STRAPP_test_for_focal_time() carries out the appropriate statistical method to test for
a relationship between diversification rates and trait data for a given point in the past (i.e. the focal_time).
It can handle three types of statistical tests depending on the type of trait data provided:
Continuous trait data: Test for correlations with the Spearman's rank correlation test (See stats::cor.test).
Binary trait data (two states): Test for differences in rates between states with the Mann-Whitney-Wilcoxon rank-sum test (See stats::wilcox.test).
Multinomial trait data (more than two states): Test for differences in rates across all states with the Kruskal-Wallis H test (See stats::kruskal.test).
If posthoc_pairwise_tests = TRUE, Dunn's post hoc pairwise rank-sum tests between pairs of states will be carried out too (See dunn.test::dunn.test).
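As a rough illustration of the three test families listed above, the following base-R sketch applies each one to made-up tip-level data. The vectors are hypothetical, and the calls show only the underlying stats functions, not how the package wires them into the permutation scheme.

```r
set.seed(1)

# Toy tip-level data (hypothetical): 30 tips with one rate per tip
rates       <- rexp(30, rate = 5)
cont_trait  <- rnorm(30)
bin_trait   <- factor(rep(c("A", "B"), each = 15))
multi_trait <- factor(rep(c("A", "B", "C"), each = 10))

# Continuous trait: Spearman's rank correlation test
spearman_test <- stats::cor.test(x = cont_trait, y = rates, method = "spearman")

# Binary trait: Mann-Whitney-Wilcoxon rank-sum test
mww_test <- stats::wilcox.test(rates ~ bin_trait)

# Multinomial trait: Kruskal-Wallis H test
kw_test <- stats::kruskal.test(rates ~ multi_trait)

# Statistic returned by each test
c(rho = unname(spearman_test$estimate),
  W   = unname(mww_test$statistic),
  H   = unname(kw_test$statistic))
```

All three are rank-based, so none of them assumes normally distributed rates.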
The function returns a list with at least two elements.
$STRAPP_results List with at least eight elements summarizing the results of the STRAPP tests.
See compute_STRAPP_test_for_focal_time() for a detailed description of the output.
$focal_time Integer. The time, in terms of time distance from the present, at which the data were extracted and the STRAPP test carried out.
Optional formatted output:
$diversification_data_df A data.frame with six columns summarizing the diversification data as found on the phylogeny for the focal_time.
See extract_diversification_data_melted_df_for_focal_time() for a detailed description of the output.
Optional data updated for the focal_time:
$updated_trait_data_with_Map A list with four elements that contains trait data found at the focal_time and an updated contMap or densityMaps
that can be used as input of plot_contMap() or plot_densityMaps_overlay() to display a phylogeny mapped with trait values/states/ranges with branches cut at the focal_time.
See extract_most_likely_trait_values_for_focal_time() for a detailed description of the output.
$updated_BAMM_object An updated BAMM_object of class "bammdata" that contains rates and regimes ID found at the focal_time.
Can be used as input of plot_BAMM_rates() to display a phylogeny mapped with diversification rates with branches cut at the focal_time.
See update_rates_and_regimes_for_focal_time() for a detailed description of the output.
Maël Doré
extract_most_likely_trait_values_for_focal_time() update_rates_and_regimes_for_focal_time()
extract_diversification_data_melted_df_for_focal_time() compute_STRAPP_test_for_focal_time()
For a guided tutorial on the complete deepSTRAPP workflow, see the associated vignettes:
For continuous trait data: vignette("deepSTRAPP_continuous_data", package = "deepSTRAPP")
For categorical trait data: vignette("deepSTRAPP_categorical_3lvl_data", package = "deepSTRAPP")
For biogeographic range data: vignette("deepSTRAPP_biogeographic_data", package = "deepSTRAPP")
if (deepSTRAPP::is_dev_version())
{
# ----- Example 1: Continuous trait ----- #
## Load data
# Load trait df
data(Ponerinae_trait_tip_data, package = "deepSTRAPP")
# Load phylogeny with old calibration
data(Ponerinae_tree_old_calib, package = "deepSTRAPP")
# Load the BAMM_object summarizing 1000 posterior samples of BAMM
data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP")
## This dataset is only available in development versions installed from GitHub.
# It is not available in CRAN versions.
# Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version.
## Prepare trait data
# Extract continuous trait data as a named vector
Ponerinae_cont_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cont_tip_data,
nm = Ponerinae_trait_tip_data$Taxa)
# Select a color scheme from lowest to highest values
color_scale <- c("darkgreen", "limegreen", "orange", "red")
# (May take several minutes to run)
# Get Ancestral Character Estimates based on a Brownian Motion model
# To obtain values at internal nodes
Ponerinae_ACE <- phytools::fastAnc(tree = Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data)
# Run a Stochastic Mapping based on a Brownian Motion model
# to interpolate values along branches and obtain a "contMap" object
Ponerinae_contMap <- phytools::contMap(Ponerinae_tree_old_calib, x = Ponerinae_cont_tip_data,
res = 100, # Number of time steps
plot = FALSE)
# Plot contMap = stochastic mapping of continuous trait
plot_contMap(contMap = Ponerinae_contMap,
color_scale = color_scale)
## Set focal time to 10 Mya
focal_time <- 10
## Run deepSTRAPP on net diversification rates for focal time = 10 Mya.
deepSTRAPP_output <- run_deepSTRAPP_for_focal_time(
contMap = Ponerinae_contMap,
ace = Ponerinae_ACE,
tip_data = Ponerinae_cont_tip_data,
trait_data_type = "continuous",
BAMM_object = Ponerinae_BAMM_object_old_calib,
focal_time = focal_time,
rate_type = "net_diversification",
return_perm_data = TRUE,
extract_diversification_data_melted_df = TRUE,
return_updated_trait_data_with_Map = TRUE,
return_updated_BAMM_object = TRUE)
## Explore output
str(deepSTRAPP_output, max.level = 1)
# Access deepSTRAPP results
str(deepSTRAPP_output$STRAPP_results)
# Access trait data
head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data)
# Access the diversification data in a melted data.frame
head(deepSTRAPP_output$diversification_data_df)
# Plot rates vs. trait values across branches
plot_rates_vs_trait_data_for_focal_time(deepSTRAPP_output)
# Plot updated contMap
plot_contMap(deepSTRAPP_output$updated_trait_data_with_Map$contMap)
ape::nodelabels(text =
deepSTRAPP_output$updated_trait_data_with_Map$contMap$tree$initial_nodes_ID)
# Plot diversification rates on updated phylogeny
plot_BAMM_rates(deepSTRAPP_output$updated_BAMM_object, labels = TRUE)
# Plot histogram of test stats
plot_histogram_STRAPP_test_for_focal_time(
deepSTRAPP_outputs = deepSTRAPP_output)
# ----- Example 2: Categorical trait ----- #
## Load data
# Load phylogeny
data(Ponerinae_tree, package = "deepSTRAPP")
# Load trait df
data(Ponerinae_trait_tip_data, package = "deepSTRAPP")
# Load the BAMM_object summarizing 1000 posterior samples of BAMM
data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP")
## This dataset is only available in development versions installed from GitHub.
# It is not available in CRAN versions.
# Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version.
## Prepare trait data
# Extract categorical data with 3-levels
Ponerinae_cat_3lvl_tip_data <- setNames(object = Ponerinae_trait_tip_data$fake_cat_3lvl_tip_data,
nm = Ponerinae_trait_tip_data$Taxa)
table(Ponerinae_cat_3lvl_tip_data)
# Select color scheme for states
colors_per_states <- c("forestgreen", "sienna", "goldenrod")
names(colors_per_states) <- c("arboreal", "subterranean", "terricolous")
# (May take several minutes to run)
## Produce densityMaps using stochastic character mapping based on an all-rates-different (ARD) Mk model
Ponerinae_cat_3lvl_data_old_calib <- prepare_trait_data(
tip_data = Ponerinae_cat_3lvl_tip_data,
phylo = Ponerinae_tree_old_calib,
trait_data_type = "categorical",
colors_per_levels = colors_per_states,
evolutionary_models = "ARD", # Use default ARD model
nb_simulations = 100, # Reduce number of simulations to save time
seed = 1234, # Set seed for reproducibility
return_best_model_fit = TRUE,
return_model_selection_df = TRUE,
plot_map = FALSE)
# Alternatively, load the precomputed output directly
data(Ponerinae_cat_3lvl_data_old_calib, package = "deepSTRAPP")
## Set focal time to 10 Mya
focal_time <- 10
# (May take several minutes to run)
## Run deepSTRAPP on net diversification rates for focal time = 10 Mya.
deepSTRAPP_output <- run_deepSTRAPP_for_focal_time(
densityMaps = Ponerinae_cat_3lvl_data_old_calib$densityMaps,
ace = Ponerinae_cat_3lvl_data_old_calib$ace,
tip_data = Ponerinae_cat_3lvl_tip_data,
trait_data_type = "categorical",
BAMM_object = Ponerinae_BAMM_object_old_calib,
focal_time = focal_time,
rate_type = "net_diversification",
posthoc_pairwise_tests = TRUE,
return_perm_data = TRUE,
extract_diversification_data_melted_df = TRUE,
return_updated_trait_data_with_Map = TRUE,
return_updated_BAMM_object = TRUE)
## Explore output
str(deepSTRAPP_output, max.level = 1)
# Access deepSTRAPP results
str(deepSTRAPP_output$STRAPP_results, max.level = 2)
# Result for overall Kruskal-Wallis test
deepSTRAPP_output$STRAPP_results[1:3]
# Results for posthoc pairwise Dunn's tests
deepSTRAPP_output$STRAPP_results$posthoc_pairwise_tests$summary_df
# Access trait data
head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data)
# Access the diversification data in a melted data.frame
head(deepSTRAPP_output$diversification_data_df)
# Plot rates vs. states across branches
plot_rates_vs_trait_data_for_focal_time(
deepSTRAPP_outputs = deepSTRAPP_output,
colors_per_levels = colors_per_states)
# Plot updated densityMaps cut at focal time
plot_densityMaps_overlay(deepSTRAPP_output$updated_trait_data_with_Map$densityMaps)
# Plot diversification rates on updated phylogeny
plot_BAMM_rates(BAMM_object = deepSTRAPP_output$updated_BAMM_object, legend = TRUE, labels = FALSE,
colorbreaks = deepSTRAPP_output$updated_BAMM_object$initial_colorbreaks$net_diversification)
# Plot histogram of Kruskal-Wallis overall test stats
plot_histogram_STRAPP_test_for_focal_time(
deepSTRAPP_outputs = deepSTRAPP_output)
# Plot histograms of posthoc pairwise Dunn's test stats
plot_histogram_STRAPP_test_for_focal_time(
deepSTRAPP_outputs = deepSTRAPP_output,
plot_posthoc_tests = TRUE)
# ----- Example 3: Biogeographic ranges ----- #
## Load data
# Load phylogeny
data(Ponerinae_tree_old_calib, package = "deepSTRAPP")
# Load trait df
data(Ponerinae_binary_range_table, package = "deepSTRAPP")
# Load the BAMM_object summarizing 1000 posterior samples of BAMM
data(Ponerinae_BAMM_object_old_calib, package = "deepSTRAPP")
## This dataset is only available in development versions installed from GitHub.
# It is not available in CRAN versions.
# Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version.
## Prepare range data for Old World vs. New World
# No overlap in ranges
table(Ponerinae_binary_range_table$Old_World, Ponerinae_binary_range_table$New_World)
Ponerinae_NO_data <- stats::setNames(object = Ponerinae_binary_range_table$Old_World,
nm = Ponerinae_binary_range_table$Taxa)
Ponerinae_NO_data <- as.character(Ponerinae_NO_data)
Ponerinae_NO_data[Ponerinae_NO_data == "TRUE"] <- "O" # O = Old World
Ponerinae_NO_data[Ponerinae_NO_data == "FALSE"] <- "N" # N = New World
names(Ponerinae_NO_data) <- Ponerinae_binary_range_table$Taxa
table(Ponerinae_NO_data)
colors_per_ranges <- c("mediumpurple2", "peachpuff2")
names(colors_per_ranges) <- c("N", "O")
# (May take several minutes to run)
## Run evolutionary models
Ponerinae_biogeo_data <- prepare_trait_data(
tip_data = Ponerinae_NO_data,
trait_data_type = "biogeographic",
phylo = Ponerinae_tree_old_calib,
evolutionary_models = "DEC+J", # Default = "DEC" for biogeographic
BioGeoBEARS_directory_path = tempdir(), # Ex: "./BioGeoBEARS_directory/"
keep_BioGeoBEARS_files = FALSE,
prefix_for_files = "Ponerinae_old_calib",
max_range_size = 2,
split_multi_area_ranges = TRUE, # Set to TRUE to display the two outputs
nb_simulations = 100, # Reduce to save time (Default = '1000')
colors_per_levels = colors_per_ranges,
return_model_selection_df = TRUE,
verbose = TRUE)
# Alternatively, load the precomputed output directly
data(Ponerinae_biogeo_data_old_calib, package = "deepSTRAPP")
## Explore output
str(Ponerinae_biogeo_data_old_calib, 1)
## Set focal time to 10 Mya
focal_time <- 10
# (May take several minutes to run)
## Run deepSTRAPP on net diversification rates for focal time = 10 Mya.
deepSTRAPP_output <- run_deepSTRAPP_for_focal_time(
densityMaps = Ponerinae_biogeo_data_old_calib$densityMaps,
ace = Ponerinae_biogeo_data_old_calib$ace,
tip_data = Ponerinae_NO_data,
trait_data_type = "biogeographic",
BAMM_object = Ponerinae_BAMM_object_old_calib,
focal_time = focal_time,
rate_type = "net_diversification",
return_perm_data = TRUE,
extract_diversification_data_melted_df = TRUE,
return_updated_trait_data_with_Map = TRUE,
return_updated_BAMM_object = TRUE)
## Explore output
str(deepSTRAPP_output, max.level = 1)
# Access deepSTRAPP results
str(deepSTRAPP_output$STRAPP_results, max.level = 2)
# Result for Mann-Whitney-Wilcoxon test
deepSTRAPP_output$STRAPP_results[1:3]
# Access trait data
head(deepSTRAPP_output$updated_trait_data_with_Map$trait_data)
# Access the diversification data in a melted data.frame
head(deepSTRAPP_output$diversification_data_df)
# Plot rates vs. ranges across branches
plot_rates_vs_trait_data_for_focal_time(
deepSTRAPP_outputs = deepSTRAPP_output,
colors_per_levels = colors_per_ranges)
# Plot updated densityMaps cut at focal time
plot_densityMaps_overlay(deepSTRAPP_output$updated_trait_data_with_Map$densityMaps)
# Plot diversification rates on updated phylogeny
plot_BAMM_rates(BAMM_object = deepSTRAPP_output$updated_BAMM_object, legend = TRUE, labels = FALSE,
colorbreaks = deepSTRAPP_output$updated_BAMM_object$initial_colorbreaks$net_diversification)
# Plot histogram of Mann-Whitney-Wilcoxon test
plot_histogram_STRAPP_test_for_focal_time(
STRAPP_results = deepSTRAPP_output$STRAPP_results)
}