Nothing
## Functions to plot rates through time in relation with trait values
# One master function to prepare data and select the proper test function according to data type
# Three sub-functions carrying out tests according to data type
#' @title Plot evolution of diversification rates in relation to trait values over time
#'
#' @description Plot the evolution of diversification rates in relation to trait values
#' extracted for multiple `time_steps` with [deepSTRAPP::run_deepSTRAPP_over_time()].
#'
#' Rates are averaged across branches at each time step (i.e., `focal_time`).
#' * For continuous data, branches are grouped by ranges of trait values defined by `quantile_ranges`.
#' * For categorical data, branches are grouped by trait states.
#' * For biogeographic data, branches are grouped by ranges.
#'
#' @param deepSTRAPP_outputs List of elements generated with [deepSTRAPP::run_deepSTRAPP_over_time()],
#' that summarize the results of multiple STRAPP tests across `$time_steps`. The list needs to include two data.frame:
#' `$trait_data_df_over_time` and `$diversification_data_df_over_time` by setting `extract_trait_data_melted_df = TRUE`
#' and `extract_diversification_data_melted_df = TRUE`.
#' @param rate_type A character string specifying the type of diversification rates to use.
#' Must be one of 'speciation', 'extinction' or 'net_diversification' (default).
#' Even if the `deepSTRAPP_outputs` object was generated with [deepSTRAPP::run_deepSTRAPP_over_time()]
#' for testing another type of rates, the `$trait_data_df_over_time` and `$diversification_data_df_over_time` data frames
#' will contain data for all types of rates.
#' @param quantile_ranges Vector of numerical. Only for continuous trait data. Quantiles used as thresholds to group branches
#' by trait values. It must start with 0 and finish with 1. Default is `c(0, 0.25, 0.5, 0.75, 1.0)`
#' which produces four balanced quantile groups.
#' @param select_trait_levels (Vector of) character string. Only for categorical and biogeographic trait data.
#' To provide a list of a subset of states/ranges to plot. Names must match the ones found in
#' `deepSTRAPP_outputs$trait_data_df_over_time$trait_value`. Default is `all` which means all states/ranges will be plotted.
#' @param time_range Vector of two numerical values. Time boundaries used for the plot.
#' If `NULL` (the default), the range of data provided in `deepSTRAPP_outputs` will be used.
#' @param color_scale Vector of character string. List of colors to use to build the color scale with [grDevices::colorRampPalette()]
#' to display the quantile groups used to discretize the continuous trait data. From lowest values to highest values. Only for continuous data.
#' Default = `NULL` will use the 'Spectral' color palette in [RColorBrewer::brewer.pal()].
#' @param colors_per_levels Named character string. To set the colors to use to plot rates of each state/range. Names = states/ranges; values = colors.
#' If `NULL` (default), the default ggplot2 color palette ([scales::hue_pal()]) will be used. Only for categorical and biogeographic data.
#' @param plot_CI Logical. Whether to plot a confidence interval (CI) based on the distribution of rates found in posterior samples. Default is `FALSE`.
#' @param CI_type Character string. To select the type of confidence interval (CI) to plot.
#' * `fuzzy` (default): to overlay the evolution of rates found in all posterior samples with high transparency levels.
#' * `quantiles_rect`: to add a polygon encompassing a proportion of the rate values found in posterior samples.
#' This proportion is defined with `CI_quantiles`.
#' @param CI_quantiles Numerical. Proportion of rate values across posterior samples encompassed by the confidence interval. Only if `CI_type = "quantiles_rect"`. Default is `0.95`.
#' @param display_plot Logical. Whether to display the plot generated in the R console. Default is `TRUE`.
#' @param PDF_file_path Character string. If provided, the plot will be saved in a PDF file following the path provided here. The path must end with ".pdf".
#' @param return_mean_data_per_samples_df Logical. Whether to include in the output the data.frame of mean rates per trait values computed for
#' each posterior sample at each time-step (aggregated across groups of branches based on trait data). This is used to draw the confidence interval. Default is `FALSE`.
#' @param return_median_data_across_samples_df Logical. Whether to include in the output the data.frame of median rates per trait values
#' across posterior samples computed for at each time-step (aggregated across groups of branches based on trait data AND posterior samples).
#' This is used to draw the lines on the plot. Default is `FALSE`.
#'
#' @export
#' @importFrom ggplot2 ggplot geom_line aes geom_hline geom_polygon scale_y_continuous scale_x_continuous scale_color_discrete scale_color_brewer scale_fill_brewer xlab ylab ggtitle theme element_line element_rect element_text unit margin
#' @importFrom dplyr left_join join_by group_by reframe summarise ungroup mutate arrange select filter
#' @importFrom cowplot save_plot
#' @importFrom stringr str_to_title
#' @importFrom stats quantile
#' @importFrom RColorBrewer brewer.pal
#' @importFrom scales hue_pal
#'
#' @return The function returns a list with at least one element.
#'
#' * `rates_TT_ggplot` An object of classes `gg` and `ggplot`. This is a ggplot that can be displayed
#' on the console with `print(output$rates_TT_ggplot)`. It corresponds to the plot being displayed on the console
#' when the function is run, if `display_plot = TRUE`, and can be further modify for aesthetics using the ggplot2 grammar.
#'
#' Optional summary data frames:
#' * `mean_data_per_samples_df` A data.frame with four columns providing the `$mean_rates` observed along branches
#' with a similar `$trait_value` (if categorical or biogeographic) or falling into the same `$quantile_ranges`.
#' Data are extracted for each posterior sample (`$BAMM_sample_ID`) at each time-step (i.e., `$focal_time`).
#' This is used to draw the confidence interval. Included if `return_mean_data_per_samples_df = TRUE`.
#' * `$median_data_across_samples_df` A data.frame with three columns providing the `$median_rates`
#' observed across all posterior samples in `$mean_data_per_samples_df`. This is used to draw the lines on the plot.
#' Included if `return_median_data_across_samples_df = TRUE`.
#'
#' If a `PDF_file_path` is provided, the function will also generate a PDF file of the plot.
#'
#' @author Maël Doré
#'
#' @seealso [deepSTRAPP::run_deepSTRAPP_over_time()]
#'
#' For a guided tutorial, see this vignette: \code{vignette("plot_rates_through_time", package = "deepSTRAPP")}
#'
#' @examples
#'
#' # ------ Example 1: Plot rates through time for continuous data ------ #
#'
#' if (deepSTRAPP::is_dev_version())
#' {
#' ## Load results of run_deepSTRAPP_over_time()
#' data(Ponerinae_deepSTRAPP_cont_old_calib_0_40, package = "deepSTRAPP")
#' ## This dataset is only available in development versions installed from GitHub.
#' # It is not available in CRAN versions.
#' # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version.
#'
#' # Visualize trait data
#' hist(Ponerinae_deepSTRAPP_cont_old_calib_0_40$trait_data_df_over_time$trait_value,
#' xlab = "Trait values", main = NULL)
#'
#' # Generate plot
#' plotTT_continuous <- plot_rates_through_time(
#' deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cont_old_calib_0_40,
#' quantile_ranges = c(0, 0.25, 0.5, 0.75, 1.0),
#' time_range = c(0, 50), # Control range of the X-axis
#' # color_scale = c("limegreen", "red"),
#' plot_CI = TRUE,
#' CI_type = "quantiles_rect",
#' CI_quantiles = 0.9,
#' display_plot = FALSE,
#' # PDF_file_path = "./plotTT_continuous.pdf",
#' return_mean_data_per_samples_df = TRUE,
#' return_median_data_across_samples_df = TRUE)
#'
#' # Explore output
#' # str(plotTT_continuous, max.level = 1)
#'
#' # Plot
#' print(plotTT_continuous$rates_TT_ggplot)
#' # Adjust aesthetics of plot a posteriori
#' plotTT_continuous_adj <- plotTT_continuous$rates_TT_ggplot +
#' ggplot2::theme(
#' plot.title = ggplot2::element_text(color = "red", size = 15),
#' axis.title = ggplot2::element_text(size = 14),
#' axis.text = ggplot2::element_text(size = 12))
#' # Plot again
#' print(plotTT_continuous_adj)
#' }
#'
#'
#' # ------ Example 2: Plot rates through time for categorical data ------ #
#'
#' if (deepSTRAPP::is_dev_version())
#' {
#' ## Load results of run_deepSTRAPP_over_time()
#' data(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40, package = "deepSTRAPP")
#' ## This dataset is only available in development versions installed from GitHub.
#' # It is not available in CRAN versions.
#' # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version.
#'
#' # Explore trait data
#' table(Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40$trait_data_df_over_time$trait_value)
#'
#' # Set colors to use
#' colors_per_states <- c("forestgreen", "sienna", "goldenrod")
#' names(colors_per_states) <- c("arboreal", "subterranean", "terricolous")
#'
#' # Generate plot only for "arboreal" and "terricolous"
#' plotTT_categorical <- plot_rates_through_time(
#' deepSTRAPP_outputs = Ponerinae_deepSTRAPP_cat_3lvl_old_calib_0_40,
#' select_trait_levels = c("arboreal", "terricolous"),
#' time_range = c(0, 50),
#' colors_per_levels = colors_per_states,
#' plot_CI = TRUE,
#' CI_type = "quantiles_rect",
#' CI_quantiles = 0.9,
#' display_plot = FALSE,
#' # PDF_file_path = "./plotTT_categorical.pdf",
#' return_mean_data_per_samples_df = TRUE,
#' return_median_data_across_samples_df = TRUE)
#'
#' # Explore output
#' # str(plotTT_categorical, max.level = 1)
#'
#' # Adjust aesthetics of plot a posteriori
#' plotTT_categorical_adj <- plotTT_categorical$rates_TT_ggplot +
#' ggplot2::theme(
#' plot.title = ggplot2::element_text(size = 15),
#' axis.title = ggplot2::element_text(size = 14),
#' axis.text = ggplot2::element_text(size = 12))
#' print(plotTT_categorical_adj)
#' }
#'
#' # ------ Example 3: Plot rates through time for biogeographic data ------ #
#'
#' if (deepSTRAPP::is_dev_version())
#' {
#' ## Load results of run_deepSTRAPP_over_time()
#' data(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40, package = "deepSTRAPP")
#' ## This dataset is only available in development versions installed from GitHub.
#' # It is not available in CRAN versions.
#' # Use remotes::install_github(repo = "MaelDore/deepSTRAPP") to get the latest development version.
#'
#' # Explore range data
#' table(Ponerinae_deepSTRAPP_biogeo_old_calib_0_40$trait_data_df_over_time$trait_value)
#'
#' # Set colors to use
#' colors_per_ranges <- c("mediumpurple2", "peachpuff2")
#' names(colors_per_ranges) <- c("N", "O")
#'
#' plotTT_biogeographic <- plot_rates_through_time(
#' deepSTRAPP_outputs = Ponerinae_deepSTRAPP_biogeo_old_calib_0_40,
#' select_trait_levels = "all",
#' time_range = c(0, 50),
#' colors_per_levels = colors_per_ranges,
#' plot_CI = TRUE,
#' CI_type = "quantiles_rect",
#' CI_quantiles = 0.9,
#' display_plot = FALSE,
#' # PDF_file_path = "./plotTT_biogeographic.pdf",
#' return_mean_data_per_samples_df = TRUE,
#' return_median_data_across_samples_df = TRUE)
#'
#' # Explore output
#' # str(plotTT_biogeographic, max.level = 1)
#'
#' # Adjust aesthetics of plot a posteriori
#' plotTT_biogeographic_adj <- plotTT_biogeographic$rates_TT_ggplot +
#' ggplot2::theme(
#' plot.title = ggplot2::element_text(size = 15),
#' axis.title = ggplot2::element_text(size = 14),
#' axis.text = ggplot2::element_text(size = 12))
#' print(plotTT_biogeographic_adj)
#' }
#'
### Master function to prepare data and select the proper test function according to data type ####
plot_rates_through_time <- function (
deepSTRAPP_outputs,
rate_type = "net_diversification",
quantile_ranges = c(0, 0.25, 0.5, 0.75, 1.0),
select_trait_levels = "all",
time_range = NULL,
color_scale = NULL,
colors_per_levels = NULL,
plot_CI = FALSE,
CI_type = "fuzzy",
CI_quantiles = 0.95,
display_plot = TRUE,
PDF_file_path = NULL,
return_mean_data_per_samples_df = FALSE,
return_median_data_across_samples_df = FALSE
)
{
### Check input validity
{
## deepSTRAPP_outputs
# deepSTRAPP_outputs must have element $trait_data_df_over_time
if (is.null(deepSTRAPP_outputs$trait_data_df_over_time))
{
stop(paste0("'$trait_data_df_over_time' is missing from 'deepSTRAPP_outputs'. You can inspect the structure of the input object with 'str(deepSTRAPP_outputs, 2)'.\n",
"See ?deepSTRAPP::run_deepSTRAPP_over_time() to learn how to generate those objects.\n",
"Especially, check if you used 'extract_trait_data_melted_df = TRUE' to save the summary data.frame of trait values ",
"found along branches at each time-step, needed for the RTT plot."))
}
# deepSTRAPP_outputs must have element $diversification_data_df_over_time
if (is.null(deepSTRAPP_outputs$diversification_data_df_over_time))
{
stop(paste0("'$diversification_data_df_over_time' is missing from 'deepSTRAPP_outputs'. You can inspect the structure of the input object with 'str(deepSTRAPP_outputs, 2)'.\n",
"See ?deepSTRAPP::run_deepSTRAPP_over_time() to learn how to generate those objects.\n",
"Especially, check if you used 'extract_diversification_data_melted_df = TRUE' to save the summary data.frame of diversification rates ",
"found along branches at each time-step, needed for the RTT plot."))
}
## rate_type must be either "speciation", "extinction" or "net_diversification"
if (!(rate_type %in% c("speciation", "extinction", "net_diversification")))
{
stop("'rate_type' can only be 'speciation', 'extinction', or 'net_diversification'.")
}
## time_range
if (!is.null(time_range))
{
# Check that two values are provided for time_range
if (length(time_range) != 2)
{
stop(paste0("'time_range' must be a vector of two positive numerical values providing the time boundaries used for the plot."))
}
# Check that time_range is strictly positive
if (!identical(time_range, abs(time_range)))
{
stop(paste0("'time_range' must be strictly positive numerical values providing the time boundaries used for the plot."))
}
# Ensure that time_range are properly ordered in increasing values
time_range <- range(time_range)
# Check that time_range encompass multiple focal times to be able to draw a line
focal_times_in_trait_df <- unique(deepSTRAPP_outputs$trait_data_df_over_time$focal_time)
focal_times_in_diversification_df <- unique(deepSTRAPP_outputs$diversification_data_df_over_time$focal_time)
shared_focal_times <- intersect(focal_times_in_trait_df, focal_times_in_diversification_df)
shared_focal_times_in_range <- (shared_focal_times >= time_range[1]) & (shared_focal_times <= time_range[2])
if (sum(shared_focal_times_in_range) < 2)
{
stop(paste0("'time_range' must encompass at least two focal_time recorded in the summary data.frames ",
"for trait data ('deepSTRAPP_outputs$trait_data_df_over_time') and diversification rates ('deepSTRAPP_outputs$diversification_data_df_over_time').\n",
"Current values of 'time_range' = ", paste(time_range, collapse = ", "), ".\n",
"'focal_time' with data recorded for traits and rates are: ", paste(shared_focal_times, collapse = ", "),"."))
}
}
## CI_type
# CI_type is either "fuzzy" or "quantiles_rect"
if (!(CI_type %in% c("fuzzy", "quantiles_rect")))
{
stop("'CI_type' can only be 'fuzzy', or 'quantiles_rect'.")
}
## CI_quantiles
# CI_quantiles should be a numerical between 0 and 1
if ((CI_quantiles < 0) | (CI_quantiles > 1))
{
stop(paste0("'CI_quantiles' reflects the proportion of rate values encompass by the confidence interval.\n",
"This is used to display the variance in rates observed across posterior samples. It must be between 0 and 1.\n",
"Current value of 'CI_quantiles' is ",CI_quantiles,"."))
}
## PDF_file_path
# If provided, PDF_file_path must end with ".pdf"
if (!is.null(PDF_file_path))
{
if (length(grep(pattern = "\\.pdf$", x = PDF_file_path)) != 1)
{
stop("'PDF_file_path' must end with '.pdf'")
}
}
## Other checks are carried in dedicated sub-functions
}
## Save initial par() and reassign them on exit
oldpar <- par(no.readonly = TRUE)
on.exit(par(oldpar))
## Detect the type of trait data
trait_data_type <- deepSTRAPP_outputs$trait_data_type
## Compute the appropriate internal function depending on the type of data
switch(EXPR = trait_data_type,
continuous = { # Case for continuous data
# Need quantile_ranges to define groups of branches per trait values
plotTT_output <- plot_rates_through_time_for_continuous_data(
deepSTRAPP_outputs = deepSTRAPP_outputs,
rate_type = rate_type,
quantile_ranges = quantile_ranges,
time_range = time_range,
color_scale = color_scale,
plot_CI = plot_CI,
CI_type = CI_type,
CI_quantiles = CI_quantiles,
display_plot = display_plot,
PDF_file_path = PDF_file_path,
return_mean_data_per_samples_df = return_mean_data_per_samples_df,
return_median_data_across_samples_df = return_median_data_across_samples_df
)
},
categorical = { # Case for categorical data
# Can select the states to plot
plotTT_output <- plot_rates_through_time_for_categorical_data(
deepSTRAPP_outputs = deepSTRAPP_outputs,
rate_type = rate_type,
select_trait_levels = select_trait_levels,
time_range = time_range,
colors_per_levels = colors_per_levels,
plot_CI = plot_CI,
CI_type = CI_type,
CI_quantiles = CI_quantiles,
display_plot = display_plot,
PDF_file_path = PDF_file_path,
return_mean_data_per_samples_df = return_mean_data_per_samples_df,
return_median_data_across_samples_df = return_median_data_across_samples_df
)
},
biogeographic = { # Case for biogeographic data
# Can select the states/ranges to plot
plotTT_output <- plot_rates_through_time_for_biogeographic_data(
deepSTRAPP_outputs = deepSTRAPP_outputs,
rate_type = rate_type,
select_trait_levels = select_trait_levels,
time_range = time_range,
colors_per_levels = colors_per_levels,
plot_CI = plot_CI,
CI_type = CI_type,
CI_quantiles = CI_quantiles,
display_plot = display_plot,
PDF_file_path = PDF_file_path,
return_mean_data_per_samples_df = return_mean_data_per_samples_df,
return_median_data_across_samples_df = return_median_data_across_samples_df
)
}
)
## Export the output
return(invisible(plotTT_output))
}
### Sub-function to handle continuous data ####
plot_rates_through_time_for_continuous_data <- function (
deepSTRAPP_outputs,
rate_type = "net_diversification",
quantile_ranges = c(0, 0.25, 0.5, 0.75, 1.0),
time_range = NULL,
color_scale = NULL,
plot_CI = FALSE,
CI_type = "fuzzy",
CI_quantiles = 0.95,
display_plot = TRUE,
PDF_file_path = NULL,
return_mean_data_per_samples_df = FALSE,
return_median_data_across_samples_df = FALSE
)
{
### Check input validity
{
## quantile_ranges
# Check that quantile_ranges are between 0 and 1
if (any(quantile_ranges < 0))
{
stop(paste0("'quantile_ranges' must be numerical values between [0, 1] providing trait data quantiles to use to aggregate rate values.\n",
"Current values are ", paste(quantile_ranges, collapse = ", ")),".")
}
if (any(quantile_ranges > 1))
{
stop(paste0("'quantile_ranges' must be numerical values between [0, 1] providing trait data quantiles to use to aggregate rate values.\n",
"Current values are ", paste(quantile_ranges, collapse = ", ")),".")
}
# Ensure that quantile_ranges are properly ordered in increasing values
quantile_ranges <- quantile_ranges[order(quantile_ranges)]
# Ensure that quantile_ranges include the c(0,1) boundaries
initial_quantile_ranges <- quantile_ranges
if (range(quantile_ranges)[1] != 0)
{
cat(paste0("WARNING: 'quantile_ranges' does not start with 0. The lower boundary was added.\n"))
quantile_ranges <- c(0, quantile_ranges)
}
if (range(quantile_ranges)[2] != 1)
{
cat(paste0("WARNING: 'quantile_ranges' does not end with 1. The upper boundary was added.\n"))
quantile_ranges <- c(quantile_ranges, 1)
}
if (!(identical(range(initial_quantile_ranges), c(0, 1))))
{
cat(paste0("WARNING: New 'quantile_ranges' are ", paste(quantile_ranges, collapse = ", "),".\n"))
}
## color_scale
# Check whether all colors are valid
if (!is.null(color_scale))
{
if (!all(is_color(color_scale)))
{
invalid_colors <- color_scale[!is_color(color_scale)]
stop(paste0("Some color names in 'color_scale' are not valid.\n",
"Invalid: ", paste(invalid_colors, collapse = ", "), "."))
}
}
}
## Adjust rate_type for labels
rate_type_label <- stringr::str_to_title(rate_type)
rate_type_label <- gsub(pattern = "_", replacement = " ", x = rate_type_label)
## Create binding of new variables to avoid Notes
tip_ID <- BAMM_sample_ID <- focal_time <- quant_traits <- NULL
trait_value <- rates <- median_rates <- mean_rates <- NULL
n_points <- points_ID <- quant_rates <- NULL
## Merge diversification and trait data
# Trait data are copied across BAMM samples
data_per_samples_df <- dplyr::left_join(
x = deepSTRAPP_outputs$diversification_data_df_over_time,
y = deepSTRAPP_outputs$trait_data_df_over_time,
by = dplyr::join_by(focal_time, tip_ID))
## Filter data for selected rate_type
if (rate_type == "speciation") { rate_type <- "lambda" }
if (rate_type == "extinction") { rate_type <- "mu" }
data_per_samples_df <- data_per_samples_df[data_per_samples_df$rate_type == rate_type, ]
# Filter data for the selected time range
if (!is.null(time_range))
{
data_per_samples_df <- data_per_samples_df[data_per_samples_df$focal_time <= time_range[2], ]
data_per_samples_df <- data_per_samples_df[data_per_samples_df$focal_time >= time_range[1], ]
} else {
# Extract time range from data
time_range <- range(data_per_samples_df$focal_time)
}
if (nrow(data_per_samples_df) == 0)
{
stop("No data found in the time range c(",time_range[1],", ", time_range[2],").\n")
}
## Compute quantile thresholds for each focal_time
quantiles_data_df <- data_per_samples_df |>
dplyr::group_by(focal_time) |>
# Compute quantiles of trait value
dplyr::reframe(quant_traits = stats::quantile(trait_value, probs = quantile_ranges, na.rm = T))
## Attribute a quantile range to trait values found across branches for each focal_time
# (will be copy across BAMM samples as trait value do not change across BAMM samples)
data_per_samples_df$quantile_ranges <- NA
# Loop per focal_time
focal_time_list <- unique(data_per_samples_df$focal_time)
for (i in seq_along(focal_time_list))
{
# i <- 1
# Extract focal_time
focal_time_i <- focal_time_list[i]
# Get thresholds for this focal_time
quantile_thresholds <- quantiles_data_df$quant_traits[quantiles_data_df$focal_time == focal_time_i]
# Loop per quantile ranges
# From highest to lowest
for (j in length(quantile_ranges):2)
{
# j <- 2
# Get quantile range name
quantile_range_name <- paste0("Q",quantile_ranges[j-1]*100,"% - Q",quantile_ranges[j]*100,"%")
# Get max threshold for this quantile range
threshold_j <- quantile_thresholds[j]
# Inform $quantile_ranges by attributing all branches with rates <= than the max threshold
data_per_samples_df$quantile_ranges[(data_per_samples_df$focal_time == focal_time_i) & (data_per_samples_df$trait_value <= threshold_j)] <- quantile_range_name
}
}
# table(data_per_samples_df$quantile_ranges) # Should be roughly equally distributed
## Aggregate across tip_ID (branches), per quantile ranges
mean_data_per_samples_df <- data_per_samples_df |>
dplyr::group_by(focal_time, BAMM_sample_ID, quantile_ranges) |>
dplyr::summarise(mean_rates = mean(rates), .groups = "keep") |>
dplyr::ungroup()
## Aggregate across BAMM samples
median_data_across_samples_df <- mean_data_per_samples_df |>
dplyr::group_by(focal_time, quantile_ranges) |>
dplyr::summarise(median_rates = median(mean_rates), .groups = "keep") |>
dplyr::ungroup()
## Prepare colors to use for quantile groups
nb_groups <- length(levels(as.factor(median_data_across_samples_df$quantile_ranges)))
if (!is.null(color_scale))
{
# Use the provided color to build the color palette
col_fn <- grDevices::colorRampPalette(colors = color_scale)
colors_per_groups <- col_fn(n = nb_groups)
} else {
# Default: use the 'Spectral' palette from RColorBrewer
colors_per_groups <- rev(RColorBrewer::brewer.pal(name = "Spectral", n = nb_groups))
}
names(colors_per_groups) <- levels(as.factor(median_data_across_samples_df$quantile_ranges))
## Case for plot without CI
if (!plot_CI)
{
rates_TT_ggplot <- ggplot2::ggplot(data = median_data_across_samples_df) +
# Plot mean lines
ggplot2::geom_line(mapping = aes(y = median_rates, x = focal_time,
group = quantile_ranges, col = quantile_ranges),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per trait values through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust color scheme and legend
# ggplot2::scale_color_brewer(name = "Trait quantile groups", palette = "Spectral", direction = -1) +
ggplot2::scale_color_manual(name = "Trait quantile groups", values = colors_per_groups) +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
} else { ## Case for plot with CI
if (CI_type == "fuzzy")
{
## Plot with fuzzy CI
rates_TT_ggplot <- ggplot2::ggplot(data = mean_data_per_samples_df) +
# Plot line replicates for all samples
ggplot2::geom_line(data = mean_data_per_samples_df,
mapping = aes(y = mean_rates, x = focal_time,
group = interaction(quantile_ranges, BAMM_sample_ID),
col = quantile_ranges),
alpha = 0.01,
linewidth = 3.0) +
# Plot mean lines
ggplot2::geom_line(data = median_data_across_samples_df,
mapping = aes(y = median_rates, x = focal_time,
group = quantile_ranges, col = quantile_ranges),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per trait values through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust color scheme and legend
# ggplot2::scale_color_brewer(name = "Trait quantile groups", palette = "Spectral", direction = -1) +
ggplot2::scale_color_manual(name = "Trait quantile groups", values = colors_per_groups) +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
} else {
## Plot with quantiles_rect CI
## Convert CI quantiles to the proportion of data to NOT include
CI_quantiles_inv <- (1 - CI_quantiles)
## Create data.frame for quantile polygons
quantiles_mean_data_df <- mean_data_per_samples_df |>
dplyr::group_by(focal_time, quantile_ranges) |>
# Compute quantiles
dplyr::reframe(quant_rates = stats::quantile(mean_rates, probs = c(CI_quantiles_inv/2, (1 - CI_quantiles_inv/2)), na.rm = T)) |>
dplyr::group_by(focal_time, quantile_ranges) |>
dplyr::mutate(quantile = c(CI_quantiles_inv/2, (1 - CI_quantiles_inv/2))) |>
# Assign points ID (order for drawing the polygon)
dplyr::group_by(quantile_ranges) |>
dplyr::arrange(quantile_ranges, quantile) |>
dplyr::mutate(n_points = dplyr::n()) |> # Count the number of points in a polygon
dplyr::mutate(points_ID = c(1:(dplyr::first(n_points)/2), dplyr::first(n_points):((dplyr::first(n_points)/2) + 1))) |>
dplyr::select(-n_points) |>
# Reorder by points ID
dplyr::arrange(quantile_ranges, points_ID) |>
# Filter for NA
dplyr::filter(!is.na(quant_rates)) |>
# Reattribute points_ID after filtering
dplyr::mutate(points_ID = dplyr::row_number()) |>
dplyr::ungroup()
rates_TT_ggplot <- ggplot2::ggplot(data = quantiles_mean_data_df) +
# Plot quantile polygons
ggplot2::geom_polygon(data = quantiles_mean_data_df,
mapping = aes(y = quant_rates, x = focal_time,
group = quantile_ranges,
fill = quantile_ranges),
alpha = 0.3,
linewidth = 1.0) +
# Plot mean lines
ggplot2::geom_line(data = median_data_across_samples_df,
mapping = aes(y = median_rates, x = focal_time,
group = quantile_ranges, col = quantile_ranges),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per trait values through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust fill scheme and legend
# ggplot2::scale_fill_brewer(name = "Trait quantile groups", palette = "Spectral", direction = -1) +
ggplot2::scale_fill_manual(name = "Trait quantile groups", values = colors_per_groups) +
# Adjust color scheme and legend
# ggplot2::scale_color_brewer(name = "Trait quantile groups", palette = "Spectral", direction = -1) +
ggplot2::scale_color_manual(name = "Trait quantile groups", values = colors_per_groups) +
# Remove fill legend
ggplot2::guides(fill = "none") +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
}
}
## Display plot if requested
if (display_plot)
{
print(rates_TT_ggplot)
}
## Export plot if requested
if (!is.null(PDF_file_path))
{
cowplot::save_plot(plot = rates_TT_ggplot,
filename = PDF_file_path,
base_height = 8, base_width = 14)
}
## Build output
output <- list()
## Store ggplot
output$rates_TT_ggplot <- rates_TT_ggplot
## Store melted df if requested
if (return_mean_data_per_samples_df)
{
output$mean_data_per_samples_df <- as.data.frame(mean_data_per_samples_df)
}
if (return_median_data_across_samples_df)
{
output$median_data_across_samples_df <- as.data.frame(median_data_across_samples_df)
}
## Return output
return(invisible(output))
}
### Sub-function to handle categorical data ####
plot_rates_through_time_for_categorical_data <- function (
deepSTRAPP_outputs,
rate_type = "net_diversification",
select_trait_levels = "all",
time_range = NULL,
colors_per_levels = NULL,
plot_CI = FALSE,
CI_type = "fuzzy",
CI_quantiles = 0.95,
display_plot = TRUE,
PDF_file_path = NULL,
return_mean_data_per_samples_df = FALSE,
return_median_data_across_samples_df = FALSE
)
{
### Check input validity
{
## Extract state levels
states_in_trait_df <- unique(deepSTRAPP_outputs$trait_data_df_over_time$trait_value)
states_in_trait_df <- states_in_trait_df[order(states_in_trait_df)]
## select_trait_levels
if (!any(select_trait_levels == "all"))
{
# Check that select_trait_levels are all found in the summary data.frame $trait_data_df_over_time
if (!all(select_trait_levels %in% states_in_trait_df))
{
stop(paste0("Some states listed in 'select_trait_levels' are not found in the summary data.frame for trait data ('deepSTRAPP_outputs$trait_data_df_over_time').\n",
"'select_trait_levels' = ",paste(select_trait_levels[order(select_trait_levels)], collapse = ", "),".\n",
"Observed states in trait data = ", paste(states_in_trait_df, collapse = ", ")),".")
}
}
# Update list of states to keep only the selected ones
if (!any(select_trait_levels == "all"))
{
states_in_trait_df <- select_trait_levels
}
## colors_per_levels
# Check whether all colors are valid
if (!is.null(colors_per_levels))
{
# Check that the color match the selected states
if (!all(states_in_trait_df %in% names(colors_per_levels)))
{
missing_states <- states_in_trait_df[!(states_in_trait_df %in% names(colors_per_levels))]
stop(paste0("Not all selected states are found in 'colors_per_levels'.\n",
"Missing states: ", paste(missing_states, collapse = ", "), "."))
}
if (!all(is_color(colors_per_levels)))
{
invalid_colors <- colors_per_levels[!is_color(colors_per_levels)]
stop(paste0("Some color names in 'colors_per_levels' are not valid.\n",
"Invalid: ", paste(invalid_colors, collapse = ", "), "."))
}
}
}
## Create binding of new variables to avoid Notes
tip_ID <- BAMM_sample_ID <- focal_time <- quant_traits <- NULL
trait_value <- rates <- median_rates <- mean_rates <- NULL
n_points <- points_ID <- quant_rates <- NULL
## Adjust rate_type for labels
rate_type_label <- stringr::str_to_title(rate_type)
rate_type_label <- gsub(pattern = "_", replacement = " ", x = rate_type_label)
## Merge diversification and trait data
# Trait data are copied across BAMM samples
data_per_samples_df <- dplyr::left_join(
x = deepSTRAPP_outputs$diversification_data_df_over_time,
y = deepSTRAPP_outputs$trait_data_df_over_time,
by = dplyr::join_by(focal_time, tip_ID))
## Filter data for selected rate_type
if (rate_type == "speciation") { rate_type <- "lambda" }
if (rate_type == "extinction") { rate_type <- "mu" }
data_per_samples_df <- data_per_samples_df[data_per_samples_df$rate_type == rate_type, ]
## Filter data for selected states
if (!("all" %in% select_trait_levels))
{
data_per_samples_df <- data_per_samples_df[data_per_samples_df$trait_value %in% select_trait_levels, ]
}
# Filter data for the selected time range
if (!is.null(time_range))
{
data_per_samples_df <- data_per_samples_df[data_per_samples_df$focal_time <= time_range[2], ]
data_per_samples_df <- data_per_samples_df[data_per_samples_df$focal_time >= time_range[1], ]
} else {
# Extract time range from data
time_range <- range(data_per_samples_df$focal_time)
}
if (nrow(data_per_samples_df) == 0)
{
stop("No data found in the time range c(",time_range[1],", ", time_range[2],") for ",paste(select_trait_levels, collapse = ", ")," states.\n")
}
## Aggregate across tip_ID (branches), per trait states
mean_data_per_samples_df <- data_per_samples_df |>
dplyr::group_by(focal_time, BAMM_sample_ID, trait_value) |>
dplyr::summarise(mean_rates = mean(rates), .groups = "keep") |>
dplyr::ungroup()
## Aggregate across BAMM samples
median_data_across_samples_df <- mean_data_per_samples_df |>
dplyr::group_by(focal_time, trait_value) |>
dplyr::summarise(median_rates = median(mean_rates), .groups = "keep") |>
dplyr::ungroup()
## Prepare colors_per_levels to use in plots
if (is.null(colors_per_levels))
{
nb_groups <- length(levels(as.factor(median_data_across_samples_df$trait_value)))
# Default: use the default ggplot palette from scales
col_fn <- scales::hue_pal()
colors_per_levels <- col_fn(n = nb_groups)
names(colors_per_levels) <- levels(as.factor(median_data_across_samples_df$trait_value))
}
## Case for plot without CI
if (!plot_CI)
{
rates_TT_ggplot <- ggplot2::ggplot(data = median_data_across_samples_df) +
# Plot mean lines
ggplot2::geom_line(mapping = aes(y = median_rates, x = focal_time,
group = trait_value, col = trait_value),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per trait states through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust color scheme and legend
# ggplot2::scale_color_discrete(name = "States") +
ggplot2::scale_color_manual(name = "States", values = colors_per_levels) +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
} else { ## Case for plot with CI
if (CI_type == "fuzzy")
{
## Plot with fuzzy CI
rates_TT_ggplot <- ggplot2::ggplot(data = mean_data_per_samples_df) +
# Plot line replicates for all samples
ggplot2::geom_line(data = mean_data_per_samples_df,
mapping = aes(y = mean_rates, x = focal_time,
group = interaction(trait_value, BAMM_sample_ID),
col = trait_value),
alpha = 0.01,
linewidth = 3.0) +
# Plot mean lines
ggplot2::geom_line(data = median_data_across_samples_df,
mapping = aes(y = median_rates, x = focal_time,
group = trait_value, col = trait_value),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per trait states through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust color scheme and legend
# ggplot2::scale_color_discrete(name = "States") +
ggplot2::scale_color_manual(name = "States", values = colors_per_levels) +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
} else {
## Plot with quantiles_rect CI
## Convert CI quantiles to the proportion of data to not include
CI_quantiles_inv <- (1 - CI_quantiles)
## Create data.frame for quantile polygons
quantiles_mean_data_df <- mean_data_per_samples_df |>
dplyr::group_by(focal_time, trait_value) |>
# Compute quantiles
dplyr::reframe(quant_rates = stats::quantile(mean_rates, probs = c(CI_quantiles_inv/2, (1 - CI_quantiles_inv/2)), na.rm = T)) |>
dplyr::group_by(focal_time, trait_value) |>
dplyr::mutate(quantile = c(CI_quantiles_inv/2, (1 - CI_quantiles_inv/2))) |>
# Assign points ID (order for drawing the polygon)
dplyr::group_by(trait_value) |>
dplyr::arrange(trait_value, quantile) |>
dplyr::mutate(n_points = dplyr::n()) |> # Count the number of points in a polygon
dplyr::mutate(points_ID = c(1:(dplyr::first(n_points)/2), dplyr::first(n_points):((dplyr::first(n_points)/2) + 1))) |>
dplyr::select(-n_points) |>
# Reorder by points ID
dplyr::arrange(trait_value, points_ID) |>
# Filter for NA
dplyr::filter(!is.na(quant_rates)) |>
# Reattribute points_ID after filtering
dplyr::mutate(points_ID = dplyr::row_number()) |>
dplyr::ungroup()
rates_TT_ggplot <- ggplot2::ggplot(data = quantiles_mean_data_df) +
# Plot quantile polygons
ggplot2::geom_polygon(data = quantiles_mean_data_df,
mapping = aes(y = quant_rates, x = focal_time,
group = trait_value,
fill = trait_value),
alpha = 0.3,
linewidth = 1.0) +
# Plot mean lines
ggplot2::geom_line(data = median_data_across_samples_df,
mapping = aes(y = median_rates, x = focal_time,
group = trait_value, col = trait_value),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per trait states through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust fill scheme and legend
# ggplot2::scale_fill_discrete("States") +
ggplot2::scale_fill_manual(name = "States", values = colors_per_levels) +
# Adjust color scheme and legend
# ggplot2::scale_color_discrete(name = "States") +
ggplot2::scale_color_manual(name = "States", values = colors_per_levels) +
# Remove fill legend
ggplot2::guides(fill = "none") +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
}
}
## Display plot if requested
if (display_plot)
{
print(rates_TT_ggplot)
}
## Export plot if requested
if (!is.null(PDF_file_path))
{
cowplot::save_plot(plot = rates_TT_ggplot,
filename = PDF_file_path,
base_height = 8, base_width = 14)
}
## Build output
output <- list()
## Store ggplot
output$rates_TT_ggplot <- rates_TT_ggplot
## Store melted df if requested
if (return_mean_data_per_samples_df)
{
output$mean_data_per_samples_df <- as.data.frame(mean_data_per_samples_df)
}
if (return_median_data_across_samples_df)
{
output$median_data_across_samples_df <- as.data.frame(median_data_across_samples_df)
}
## Return output
return(invisible(output))
}
### Sub-function to handle biogeographic data ####
plot_rates_through_time_for_biogeographic_data <- function (
deepSTRAPP_outputs,
rate_type = "net_diversification",
select_trait_levels = "all",
time_range = NULL,
colors_per_levels = NULL,
plot_CI = FALSE,
CI_type = "fuzzy",
CI_quantiles = 0.95,
display_plot = TRUE,
PDF_file_path = NULL,
return_mean_data_per_samples_df = FALSE,
return_median_data_across_samples_df = FALSE
)
{
### Check input validity
{
## Extract range levels
ranges_in_trait_df <- unique(deepSTRAPP_outputs$trait_data_df_over_time$trait_value)
ranges_in_trait_df <- ranges_in_trait_df[order(ranges_in_trait_df)]
## select_trait_levels
if (!any(select_trait_levels == "all"))
{
# Check that select_trait_levels are all found in the summary data.frame $trait_data_df_over_time
if (!all(select_trait_levels %in% ranges_in_trait_df))
{
stop(paste0("Some ranges listed in 'select_trait_levels' are not found in the summary data.frame for trait data ('deepSTRAPP_outputs$trait_data_df_over_time').\n",
"'select_trait_levels' = ",paste(select_trait_levels[order(select_trait_levels)], collapse = ", "),".\n",
"Observed ranges in trait data = ", paste(ranges_in_trait_df, collapse = ", ")),".")
}
}
# Update list of ranges to keep only the selected ones
if (!any(select_trait_levels == "all"))
{
ranges_in_trait_df <- select_trait_levels
}
## colors_per_levels
# Check whether all colors are valid
if (!is.null(colors_per_levels))
{
# Check that the color match the states
if (!all(ranges_in_trait_df %in% names(colors_per_levels)))
{
missing_ranges <- ranges_in_trait_df[!(ranges_in_trait_df %in% names(colors_per_levels))]
stop(paste0("Not all selected ranges are found in 'colors_per_levels'.\n",
"Missing ranges: ", paste(missing_ranges, collapse = ", "), "."))
}
if (!all(is_color(colors_per_levels)))
{
invalid_colors <- colors_per_levels[!is_color(colors_per_levels)]
stop(paste0("Some color names in 'colors_per_levels' are not valid.\n",
"Invalid: ", paste(invalid_colors, collapse = ", "), "."))
}
}
}
## Create binding of new variables to avoid Notes
tip_ID <- BAMM_sample_ID <- focal_time <- quant_traits <- NULL
trait_value <- rates <- median_rates <- mean_rates <- NULL
n_points <- points_ID <- quant_rates <- NULL
## Adjust rate_type for labels
rate_type_label <- stringr::str_to_title(rate_type)
rate_type_label <- gsub(pattern = "_", replacement = " ", x = rate_type_label)
## Merge diversification and trait data
# Trait data are copied across BAMM samples
data_per_samples_df <- dplyr::left_join(
x = deepSTRAPP_outputs$diversification_data_df_over_time,
y = deepSTRAPP_outputs$trait_data_df_over_time,
by = dplyr::join_by(focal_time, tip_ID))
## Filter data for selected rate_type
if (rate_type == "speciation") { rate_type <- "lambda" }
if (rate_type == "extinction") { rate_type <- "mu" }
data_per_samples_df <- data_per_samples_df[data_per_samples_df$rate_type == rate_type, ]
## Filter data for selected states/ranges
if (!("all" %in% select_trait_levels))
{
data_per_samples_df <- data_per_samples_df[data_per_samples_df$trait_value %in% select_trait_levels, ]
}
# Filter data for the selected time range
if (!is.null(time_range))
{
data_per_samples_df <- data_per_samples_df[data_per_samples_df$focal_time <= time_range[2], ]
data_per_samples_df <- data_per_samples_df[data_per_samples_df$focal_time >= time_range[1], ]
} else {
# Extract time range from data
time_range <- range(data_per_samples_df$focal_time)
}
if (nrow(data_per_samples_df) == 0)
{
stop("No data found in the time range c(",time_range[1],", ", time_range[2],") in ",paste(select_trait_levels, collapse = ", ")," ranges.\n")
}
## Aggregate across tip_ID (branches), per trait ranges
mean_data_per_samples_df <- data_per_samples_df |>
dplyr::group_by(focal_time, BAMM_sample_ID, trait_value) |>
dplyr::summarise(mean_rates = mean(rates), .groups = "keep") |>
dplyr::ungroup()
## Aggregate across BAMM samples
median_data_across_samples_df <- mean_data_per_samples_df |>
dplyr::group_by(focal_time, trait_value) |>
dplyr::summarise(median_rates = median(mean_rates), .groups = "keep") |>
dplyr::ungroup()
## Prepare colors_per_levels to use in plots
if (is.null(colors_per_levels))
{
nb_groups <- length(levels(as.factor(median_data_across_samples_df$trait_value)))
# Default: use the default ggplot palette from scales
col_fn <- scales::hue_pal()
colors_per_levels <- col_fn(n = nb_groups)
names(colors_per_levels) <- levels(as.factor(median_data_across_samples_df$trait_value))
}
## Case for plot without CI
if (!plot_CI)
{
rates_TT_ggplot <- ggplot2::ggplot(data = median_data_across_samples_df) +
# Plot mean lines
ggplot2::geom_line(mapping = aes(y = median_rates, x = focal_time,
group = trait_value, col = trait_value),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per ranges through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust color scheme and legend
# ggplot2::scale_color_discrete(name = "Ranges") +
ggplot2::scale_color_manual(name = "Ranges", values = colors_per_levels) +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
} else { ## Case for plot with CI
if (CI_type == "fuzzy")
{
## Plot with fuzzy CI
rates_TT_ggplot <- ggplot2::ggplot(data = mean_data_per_samples_df) +
# Plot line replicates for all samples
ggplot2::geom_line(data = mean_data_per_samples_df,
mapping = aes(y = mean_rates, x = focal_time,
group = interaction(trait_value, BAMM_sample_ID),
col = trait_value),
alpha = 0.01,
linewidth = 3.0) +
# Plot mean lines
ggplot2::geom_line(data = median_data_across_samples_df,
mapping = aes(y = median_rates, x = focal_time,
group = trait_value, col = trait_value),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per ranges through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust color scheme and legend
# ggplot2::scale_color_discrete(name = "Ranges") +
ggplot2::scale_color_manual(name = "Ranges", values = colors_per_levels) +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
} else {
## Plot with quantiles_rect CI
## Convert CI quantiles to the proportion of data to not include
CI_quantiles_inv <- (1 - CI_quantiles)
## Create data.frame for quantile polygons
quantiles_mean_data_df <- mean_data_per_samples_df |>
dplyr::group_by(focal_time, trait_value) |>
# Compute quantiles
dplyr::reframe(quant_rates = stats::quantile(mean_rates, probs = c(CI_quantiles_inv/2, (1 - CI_quantiles_inv/2)), na.rm = T)) |>
dplyr::group_by(focal_time, trait_value) |>
dplyr::mutate(quantile = c(CI_quantiles_inv/2, (1 - CI_quantiles_inv/2))) |>
# Assign points ID (order for drawing the polygon)
dplyr::group_by(trait_value) |>
dplyr::arrange(trait_value, quantile) |>
dplyr::mutate(n_points = dplyr::n()) |> # Count the number of points in a polygon
dplyr::mutate(points_ID = c(1:(dplyr::first(n_points)/2), dplyr::first(n_points):((dplyr::first(n_points)/2) + 1))) |>
dplyr::select(-n_points) |>
# Reorder by points ID
dplyr::arrange(trait_value, points_ID) |>
# Filter for NA
dplyr::filter(!is.na(quant_rates)) |>
# Reattribute points_ID after filtering
dplyr::mutate(points_ID = dplyr::row_number()) |>
dplyr::ungroup()
rates_TT_ggplot <- ggplot2::ggplot(data = quantiles_mean_data_df) +
# Plot quantile polygons
ggplot2::geom_polygon(data = quantiles_mean_data_df,
mapping = aes(y = quant_rates, x = focal_time,
group = trait_value,
fill = trait_value),
alpha = 0.3,
linewidth = 1.0) +
# Plot mean lines
ggplot2::geom_line(data = median_data_across_samples_df,
mapping = aes(y = median_rates, x = focal_time,
group = trait_value, col = trait_value),
alpha = 1.0,
linewidth = 1.5) +
# Plot div = 0 line
ggplot2::geom_hline(yintercept = 0, linewidth = 1.0, linetype = "dashed") +
# Set plot title +
ggplot2::ggtitle(label = paste0(rate_type_label, " rates per ranges through time")) +
# Set axes labels
ggplot2::xlab("Time") +
ggplot2::ylab(paste0(rate_type_label, " rates\n[Events / lineage / My]")) +
# Prevent rate Y-scale to expand
ggplot2::scale_y_continuous(expand = c(0, 0)) +
# Reverse time scale
ggplot2::scale_x_continuous(transform = "reverse",
limits = rev(time_range)) +
# Adjust fill scheme and legend
# ggplot2::scale_fill_discrete("Ranges") +
ggplot2::scale_fill_manual(name = "Ranges", values = colors_per_levels) +
# Adjust color scheme and legend
# ggplot2::scale_color_discrete(name = "Ranges") +
ggplot2::scale_color_manual(name = "Ranges", values = colors_per_levels) +
# Remove fill legend
ggplot2::guides(fill = "none") +
# Adjust aesthetics
ggplot2::theme(
plot.margin = ggplot2::margin(0.3, 0.5, 0.5, 0.5, "inches"), # trbl
panel.grid.major = ggplot2::element_line(color = "grey70", linetype = "dashed", linewidth = 0.3),
panel.background = ggplot2::element_rect(fill = NA, color = NA),
plot.title = ggplot2::element_text(size = 20, hjust = 0.5, color = "black",
margin = ggplot2::margin(b = 10, t = 5)),
legend.title = ggplot2::element_text(size = 16, margin = ggplot2::margin(b = 5)),
legend.position = "right",
# legend.position = "inside",
# legend.position.inside = c(0.15, 0.2),
legend.text = ggplot2::element_text(size = 12),
legend.key = ggplot2::element_rect(colour = NA, fill = NA, linewidth = 5),
legend.key.size = ggplot2::unit(1.8, "line"),
legend.spacing.y = ggplot2::unit(0.5, "line"),
axis.title = ggplot2::element_text(size = 20, color = "black"),
axis.title.x = ggplot2::element_text(margin = ggplot2::margin(t = 10)),
axis.title.y = ggplot2::element_text(margin = ggplot2::margin(r = 12)),
axis.line = ggplot2::element_line(linewidth = 1.0),
axis.ticks.length = ggplot2::unit(8, "pt"),
axis.text = ggplot2::element_text(size = 18, color = "black"),
axis.text.x = ggplot2::element_text(margin = ggplot2::margin(t = 5)),
axis.text.y = ggplot2::element_text(margin = ggplot2::margin(r = 5)))
}
}
## Display plot if requested
if (display_plot)
{
print(rates_TT_ggplot)
}
## Export plot if requested
if (!is.null(PDF_file_path))
{
cowplot::save_plot(plot = rates_TT_ggplot,
filename = PDF_file_path,
base_height = 8, base_width = 14)
}
## Build output
output <- list()
## Store ggplot
output$rates_TT_ggplot <- rates_TT_ggplot
## Store melted df if requested
if (return_mean_data_per_samples_df)
{
output$mean_data_per_samples_df <- as.data.frame(mean_data_per_samples_df)
}
if (return_median_data_across_samples_df)
{
output$median_data_across_samples_df <- as.data.frame(median_data_across_samples_df)
}
## Return output
return(invisible(output))
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.