R/make_eems_plots.R

#' Plot effective migration and diversity surfaces
#'
#' Given a vector of EEMS output directories, this function generates several figures to visualize EEMS results. It is a good idea to examine all these figures, which is why they are generated by default.
#' \describe{
#'  \item{mrates01}{effective migration surface. This contour plot visualizes the estimated effective migration rates \code{m}, on the log10 scale after mean centering.}
#'  \item{mrates02}{posterior probability contours \code{P(log(m) > 0) = p} and \code{P(log(m) < 0) = p} for the given probability level \code{p}. Since migration rates are visualized on the log10 scale after mean centering, 0 corresponds to the overall mean migration rate. This contour plot emphasizes regions with effective migration that is significantly higher/lower than the overall average.}
#'  \item{qrates01}{effective diversity surface. This contour plot visualizes the estimated effective diversity rates \code{q}, on the log10 scale after mean centering.}
#'  \item{qrates02}{posterior probability contours \code{P(log(q) > 0) = p} and \code{P(log(q) < 0) = p}. Similar to \code{mrates02} but applied to the effective diversity rates.}
#'  \item{rdist01}{scatter plot of the observed vs the fitted between-deme component of genetic dissimilarity, where one point represents a pair of sampled demes.}
#'  \item{rdist01}{scatter plot of the observed vs the fitted within-deme component of genetic dissimilarity, where one point represents a sampled deme.}
#'  \item{rdist03}{scatter plot of observed genetic dissimilarities between demes vs observed geographic distances between demes.}
#'  \item{pilogl01}{posterior probability trace}
#' }
#' @details
#' The function \code{make_eems_plots} will work given the results from a single EEMS run (one directory in \code{mcmcpath}) but it is better to run EEMS several times, randomly initializing the MCMC chain each time. In other words, simulate several realizations of the Markov chain and let each realization start from a different state in the parameter space (by using a different random seed).
##'
#' The \code{mrates} and \code{qrates} figures visualize (properties of) the effective migration and diversity rates across the habitat. The other figures can help to check that the MCMC sampler has converged (the trace plot \code{pilogl}) and that the EEMS model fits the data well (the scatter plots of genetic dissimilarities \code{rdist}).
#'
#' To describe the within-deme and between-deme components of genetic dissimilarity, let \code{D(a,b)} be the dissimilarity between one individual from deme \code{a} and another individual from deme \code{b}. Then the within-deme component for \code{a} and \code{b} is simply \code{D(a,a)} and \code{D(b, b)}, respectively. The between-deme component is \code{D(a,b) - [D(a,a) + D(b,b)] / 2} and it represents dissimilarity that is due to the spatial structure of the population and is not a consequence of the local diversity in the two demes.
#' @param mcmcpath A vector of EEMS output directories, for the same dataset. Warning: There is minimal checking that the directories all correspond to the same dataset.
#' @param longlat A logical value indicating whether the coordinates are given as pairs (longitude, latitude) or (latitude, longitude).
#' @param dpi Resolution of the contour raster. The default is 250.
#' @param add_grid A logical value indicating whether to add the population grid or not.
#' @param col_grid The color of the population grid. Defaults to gray.
#' @param add_outline A logical value indicating whether to add the habitat outline or not.
#' @param col_outline The color of the habitat outline. Defaults to white.
#' @param add_demes A logical value indicating whether to add the observed demes or not.
#' @param col_demes The color of the demes. Defaults to black.
#' @param eems_colors The EEMS color scheme as a vector of colors, ordered from low to high. Defaults to a DarkOrange to Blue divergent palette with six orange shades, white in the middle, six blue shades. Acknowledgement: The default color scheme is adapted from the \code{dichromat} package.
#' @param m_colscale A fixed range for log10-transformed migration rates. If the estimated rates fall outside the specified range, then the color scale is ignored. The default range is \code{[-2.5, +2.5]}.
#' @param q_colscale A fixed range for log10-transformed diversity rates. The default range is \code{-0.1, +0.1}.
#' @param add_abline Add the line \code{y = x} to the diagnostic scatter plots of observed vs fitted genetic dissimilarities.
#' @param prob_level A probability \code{p} to define the posterior probability contours \code{P(log(m) > 0) = p} and \code{P(log(m) < 0) = p}. Defaults to \code{0.9}.
#' @references Light A and Bartlein PJ (2004). The End of the Rainbow? Color Schemes for Improved Data Graphics. EOS Transactions of the American Geophysical Union, 85(40), 385.
#' @examples
#' # Use the provided example or supply the path to your own EEMS run.
#' mcmcpath <- system.file("extdata", "EEMS-barrier", package = "reemsplots2")
#'
#' # Generate contour plots of migration and diversity rates
#' # as well as several diagnostic plots
#' plots <- make_eems_plots(mcmcpath, longlat = TRUE)
#' names(plots)
#' \donttest{
#' # Save the various plots
#' library("ggplot2")
#' plotpath <- file.path(path.expand("~"), "EEMS-barrier")
#' ggsave(paste0(plotpath, "-mrates01.png"), plots$mrates01, dpi = 600,
#'        width = 6, height = 4)
#' ggsave(paste0(plotpath, "-mrates02.png"), plots$mrates02, dpi = 600,
#'        width = 6, height = 4)
#' ggsave(paste0(plotpath, "-qrates01.png"), plots$qrates01, dpi = 600,
#'        width = 6, height = 4)
#' ggsave(paste0(plotpath, "-qrates02.png"), plots$qrates02, dpi = 600,
#'        width = 6, height = 4)
#' ggsave(paste0(plotpath, "-rdist01.pdf"), plots$rdist01,
#'        width = 6.5, height = 6)
#' ggsave(paste0(plotpath, "-rdist02.pdf"), plots$rdist02,
#'        width = 6.5, height = 6)
#' ggsave(paste0(plotpath, "-rdist03.pdf"), plots$rdist03,
#'        width = 6.5, height = 6)
#' ggsave(paste0(plotpath, "-pilogl01.pdf"), plots$pilogl01,
#'        width = 7, height = 5)
#' }
#' @seealso \code{\link{plot_population_grid}}, \code{\link{plot_resid_heatmap}}, \code{\link{plot_voronoi_tiles}}
#' @export

make_eems_plots <- function(mcmcpath, longlat = TRUE, dpi = 250,
                            add_grid = FALSE, col_grid = "#BBBBBB",
                            add_demes = FALSE, col_demes = "#000000",
                            add_outline = FALSE, col_outline = "#FFFFFF",
                            eems_colors = NULL, prob_level = 0.9,
                            m_colscale = NULL, q_colscale = NULL,
                            add_abline = FALSE) {
  check_mcmcpath_contents(mcmcpath)
  func_params <- list(add_grid = add_grid, add_demes = add_demes,
                      add_outline = add_outline,
                      eems_colors = eems_colors, prob_level = prob_level,
                      col_grid = col_grid, col_demes = col_demes,
                      col_outline = col_outline,
                      m_colscale = m_colscale, q_colscale = q_colscale,
                      add_abline = add_abline)
  plot_params <- check_plot_params(func_params)
  dimns <- read_dimns(mcmcpath[1], longlat, dpi)
  plots <- list()

  p <- eems_contours(mcmcpath, dimns, longlat, plot_params, is_mrates = TRUE)
  plots$mrates01 <- p[[1]]
  plots$mrates02 <- p[[2]]
  p <- eems_contours(mcmcpath, dimns, longlat, plot_params, is_mrates = FALSE)
  plots$qrates01 <- p[[1]]
  plots$qrates02 <- p[[2]]

  dissimilarities <- pairwise_dist(mcmcpath, longlat, plot_params)
  p <- plot_pairwise_dissimilarities_(dissimilarities, add_abline)
  plots$rdist01 <- p[[1]]
  plots$rdist02 <- p[[2]]
  plots$rdist03 <- p[[3]]

  plots$pilogl01 <- plot_log_posterior(mcmcpath)
  plots
}

eems_contours <- function(mcmcpath, dimns, longlat, plot_params, is_mrates) {
  if (is_mrates)
    message("Generate effective migration surface ",
            "(posterior mean of m rates). ",
            "See plots$mrates01 and plots$mrates02.")
  else
    message("Generate effective diversity surface ",
            "(posterior mean of q rates). ",
            "See plots$qrates01 and plots$qrates02.")
  zrates <- rep(0, dimns$nmrks)
  pr_gt0 <- rep(0, dimns$nmrks)
  pr_lt0 <- rep(0, dimns$nmrks)
  niters <- 0
  # Loop over each directory in mcmcpath to average the contour plots
  for (path in mcmcpath) {
    voronoi <- read_voronoi(path, longlat, is_mrates)
    rslt <- tiles2contours(voronoi$tiles, voronoi$rates,
                           cbind(voronoi$xseed, voronoi$yseed),
                           dimns$marks, dimns$dist_metric)
    zrates <- zrates + rslt$zrates
    niters <- niters + rslt$niters
    pr_gt0 <- pr_gt0 + rslt$pr_gt0
    pr_lt0 <- pr_lt0 + rslt$pr_lt0
  }
  zrates <- zrates / niters
  pr_gt0 <- pr_gt0 / niters
  pr_lt0 <- pr_lt0 / niters
  p1 <- filled_eems_contour(dimns, zrates, plot_params, is_mrates)
  p2 <- filled_prob_contour(dimns, pr_gt0 - pr_lt0, plot_params, is_mrates)
  list(p1, p2)
}

plot_log_posterior <- function(mcmcpath) {
  message("Generate posterior probability trace. ",
          "See plots$pilog01.")
  rleid <- function(x) {
    r <- rle(x)
    rep(seq_along(r$lengths), r$lengths)
  }
  pl_df <- NULL
  for (path in mcmcpath) {
    pl <- read_matrix(file.path(path, "mcmcpilogl.txt"))
    pl_df <- bind_rows(pl_df, as_data_frame(pl) %>% mutate(path))
  }
  pl_df <- pl_df %>%
    setNames(c("pi", "logl", "path")) %>%
    mutate(mcmcpath = factor(rleid(path))) %>%
    group_by(mcmcpath) %>%
    mutate(iter = row_number(), pilogl = pi + logl)
  ggplot(pl_df, aes(x = iter, y = pilogl, color = mcmcpath)) +
    geom_path() +
    labs(x = "MCMC iteration  (after burn-in and thinning)",
         y = "log posterior",
         title = "Have the MCMC chains converged?",
         subtitle = "If not, restart EEMS and/or increase numMCMCIter") +
    theme_bw() +
    theme(panel.grid.minor = element_blank(),
          panel.grid.major.x = element_blank())
}

plot_pairwise_dissimilarities_ <- function(dissimilarities, add_abline) {
  message("Generate average dissimilarities within and between demes. ",
          "See plots$rdist01, plots$rdist02 and plots$rdist03.")
  p1 <- ggplot(dissimilarities$between %>% filter(size > 1),
               aes(fitted, obsrvd)) +
    geom_point(shape = 1) +
    theme_minimal() +
    labs(x = expression(paste("Fitted dissimilarity between demes  ",
                              Delta[alpha * beta], " - (",
                              Delta[alpha * alpha], " + ",
                              Delta[beta * beta], ") / 2")),
         y = expression(paste("Observed dissimilarity between demes  ",
                              D[alpha * beta], " - (",
                              D[alpha * alpha], " + ",
                              D[beta * beta], ") / 2")),
         title = expression(paste("Dissimilarities between pairs of ",
                                  "sampled demes (", alpha, ", ", beta, ")")),
         subtitle = paste("Singleton demes, if any, are excluded from this",
                          "plot (but not from EEMS)"))
  p2 <- ggplot(dissimilarities$within %>% filter(size > 1),
               aes(fitted, obsrvd)) +
    geom_point(shape = 1) +
    theme_minimal() +
    labs(x = expression(paste("Fitted dissimilarity within demes  ",
                              Delta[alpha * alpha])),
         y = expression(paste("Observed dissimilarity within demes ",
                              D[alpha * alpha])),
         title = expression(paste("Dissimilarities within sampled ",
                                  "demes ", alpha)),
         subtitle = paste("Singleton demes, if any, are excluded from ",
                          "this plot (but not from EEMS)"))
  p3 <- ggplot(dissimilarities$ibd %>% filter(size > 1),
               aes(fitted, obsrvd)) +
    geom_point(shape = 1) +
    theme_minimal() +
    labs(x = "Great circle distance between demes (km)",
         y = expression(paste("Observed dissimilarity between demes  ",
                              D[alpha * beta], " - (",
                              D[alpha * alpha], " + ",
                              D[beta * beta], ") / 2")),
         title = expression(paste("Dissimilarities between pairs of ",
                                  "sampled demes (", alpha, ", ", beta, ")")),
         subtitle = paste("Singleton demes, if any, are excluded from this",
                          "plot (but not from EEMS)"))
  if (add_abline) {
    p1 <- p1 + geom_smooth(method = "lm", se = FALSE)
    p2 <- p2 + geom_smooth(method = "lm", se = FALSE)
  }
  list(p1, p2, p3)
}
dipetkov/reemsplots2 documentation built on May 15, 2019, 8:47 a.m.