R/gl.run.eems.r

Defines functions gl.run.eems

Documented in gl.run.eems

#' Runs the EEMS algorithm  (Estimating Effective Migration Surfaces) on a 
#' genlight object.
#'
#' @description
#' This function runs the EEMS algorithm on a genlight object. The EEMS
#'  algorithm is a spatially explicit model that estimates effective migration 
#'  surfaces (EEMS) from genetic data. The EEMS algorithm is implemented in 
#'  C++, hence it is necessary to have the binary downloaded and the function 
#'  needs to point to this file via the path specified in eems.path. The binary
#'   is call runeems_snps[.exe] and can be downloaded from the github site of
#'    dartRverse (https://github.com/green-striped-gecko/dartRverse/tree/main/binaries).
#'
#' @param x Name of the genlight object containing the SNP data [required].
#' @param eems.path Path to the folder containing the eems executable
#'  [default working directory ("./")].
#' @param buffer Buffer distance for all the elements [default 10000].
#' @param nDemes The approximate number of demes in the population graph 
#' [default 500].
#' @param diploid Whether the organism is diploid [default TRUE].
#' @param numMCMCIter Number of MCMC iterations [default 10000].
#' @param numBurnIter Number of burn-in iterations to discard at the start 
#' [default 2000].
#' @param numThinIter Number of iterations to thin between two writing steps
#'  [default 9].
#' @param seed 	An integer used to seed the random number generator 
#' [default NULL].
#' @param dpi Resolution of the contour raster [default 250].
#' @param add_grid A logical value indicating whether to add the population 
#' grid or not [default FALSE].
#' @param col_grid The color of the population grid [default gray ("#BBBBBB")].
#' @param add_demes A logical value indicating whether to add the observed 
#' demes or not [default FALSE].
#' @param col_demes The color of the demes [default black ("#000000")].
#' @param add_outline A logical value indicating whether to add the habitat 
#' outline or not [default FALSE].
#' @param col_outline The color of the habitat outline 
#' [default white ("#FFFFFF")].
#' @param eems_colors The EEMS color scheme as a vector of colors, ordered 
#' from low to high [default NULL].
#' @param plot.colors.pop A color palette for population plots or a list with
#' as many colors as there are populations in the dataset
#' [default gl.colors("dis")].
#' @param out.dir Path where to save the output file. Use outpath=getwd() or 
#' out.dir='.' when calling this function to direct output files to your 
#' working or current directory [default tempdir(), mandated by CRAN].
#' @param plot.dir Directory to save the plot RDS files 
#' [default as specified by the global working directory or tempdir()].
#' @param plot.file Name for the RDS binary file to save (base name only, 
#' exclude extension) [default NULL].
#' @param verbose Verbosity: 0, silent or fatal errors; 1, begin and end; 2, 
#' progress log; 3, progress and results summary; 5, full report 
#' [default NULL, unless specified using gl.set.verbosity].
#' @param cleanup Whether to delete intermediate files [default TRUE].
#' @param ... Extra parameters to add to function reemsplots2::make_eems_plots.
#' 
#' @details
#' Set the Number of MCMC iterations to 2 million for an initial run. If the 
#' posterior trace is still trending, lengthen the chain. Set the number of 
#' initial burnin iterations at 50% of total iterations. Set the iterations to 
#' thin between writes so that total/thin is around 100–200; a 10000‑step 
#' thinning interval is a practical default.
#' 
#' Choose the number of demes to match geographic scale: 100–250 for local or 
#' island studies and 300–500 for continental datasets. Run time grows 
#' cubically with number of demes, so anything above 1000 rarely pays off. 
#' 
#' Draw the habitat polygon with a small buffer (in meters), so every sample 
#' sits at least one grid spacing inside the edge. A 5–10 % expansion of the 
#' sample bounding box (or a few kilometers for fine‑scale work) is usually 
#' adequate.
#' 
#' For plots, use a raster resolution of 600 dpi, this is publication‑quality. 
#' Drop to 300 dpi for quick diagnostics. Higher detail—higher resolution 
#' affects file size, not the inference itself.
#' 
#' @return A list of contour plots of migration and diversity rates as well as
#' several diagnostic plots. It is a good idea to examine all these figures,
#' which is why they are generated by default. Please check the examples how to
#' customise the figures.
#' \describe{
#'  \item{mrates01}{Effective migration surface. This contour plot visualises
#'   the estimated effective migration rates \code{m}, on the log10 scale after
#'    mean centering.}
#'  \item{mrates02}{Posterior probability contours \code{P(log(m) > 0) = p} 
#'  and \code{P(log(m) < 0) = p} for the given probability level \code{p}. 
#'  Since migration rates are visualised on the log10 scale after mean 
#'  centering, 0 corresponds to the overall mean migration rate. This contour 
#'  plot emphasizes regions with effective migration that is significantly 
#'  higher/lower than the overall average.}
#'  \item{qrates01}{Effective diversity surface. This contour plot visualises 
#'  the estimated effective diversity rates \code{q}, on the log10 scale after
#'   mean centering.}
#'  \item{qrates02}{Posterior probability contours \code{P(log(q) > 0) = p} 
#'  and \code{P(log(q) < 0) = p}. Similar to \code{mrates02} but applied to 
#'  the effective diversity rates.}
#'  \item{rdist01}{Scatter plot of the observed vs the fitted between-deme
#'   component of genetic dissimilarity, where one point represents a pair of 
#'   sampled demes.}
#'  \item{rdist02}{Scatter plot of the observed vs the fitted within-deme
#'   component of genetic dissimilarity, where one point represents a 
#'   sampled deme.}
#'  \item{rdist03}{Scatter plot of observed genetic dissimilarities between 
#'  demes vs observed geographic distances between demes.}
#'  \item{pilogl01}{Posterior probability trace}
#' }
#' @export
#' @importFrom grDevices chull
#' @importFrom utils write.table
#' @importFrom dismo Mercator
#' @importFrom stats runif
#' @author Bernd Gruber  & Robyn (bugs? Post to
#' \url{https://groups.google.com/d/forum/dartr})
#' @references
#' \itemize{
#' \item Petkova D (2024). _reemsplots2: Generate plots to inspect and 
#' visualize the results of EEMS_. R package version 0.1.0,
#' \url{https://github.com/dipetkov/eems}.
#' \item 
#' D Petkova, J Novembre, M Stephens. Visualizing spatial population structure
#' with estimated effective migration surfaces. Nature Genetics 48, 
#' 94 -- 100 (2016). \doi{10.1038/ng.3464}.
#' }
#' @examples
#'  \dontrun{
#'  # This example needs a binary (runeems_snps[.exe]) specific to your 
#'  # operating system  to run
#'  eems <- gl.run.eems(bandicoot.gl, eems.path = "d:/downloads/eems/")
#'  print(eems[[1]])
#'  }
#'

gl.run.eems <- function(x,
                        eems.path = "./",
                        buffer = 10000,
                        nDemes = 500,
                        diploid = TRUE,
                        numMCMCIter = 10000,
                        numBurnIter = 2000,
                        numThinIter = 9,
                        seed = NULL,
                        dpi = 250,
                        add_grid = FALSE,
                        col_grid = "#BBBBBB", 
                        add_demes = FALSE, 
                        col_demes = "#000000",
                        add_outline = FALSE,
                        col_outline = "#FFFFFF", 
                        eems_colors = NULL,
                        plot.colors.pop = gl.colors("dis"),
                        out.dir = NULL,
                        plot.dir = NULL,
                        plot.file = NULL,
                        verbose = NULL,
                        cleanup = TRUE,
                        ...) {
  #util function to calculate similarities
  bed2diffs_v2 <- function(Geno) {
    nIndiv <- nrow(Geno)
    nSites <- ncol(Geno)
    Miss <- is.na(Geno)
    ## Impute NAs with the column means (= twice the allele frequencies)
    Mean <- matrix(
      colMeans(Geno, na.rm = TRUE),
      ## a row of means
      nrow = nIndiv,
      ncol = nSites,
      byrow = TRUE) 
    ## a matrix with nIndiv identical rows of means
    Mean[Miss == 0] <- 0 
    ## Set the means that correspond to observed genotypes to 0
    Geno[Miss == 1] <- 0 
    ## Set the missing genotypes to 0 (used to be NA)
    Geno <- Geno + Mean
    ## Compute similarities
    Sim <- Geno %*% t(Geno) / nSites
    SelfSim <- diag(Sim) ## self-similarities
    vector1s <- rep(1, nIndiv) ## vector of 1s
    ## This chunk generates a `diffs` matrix
    Diffs <-
      SelfSim %*% t(vector1s) + vector1s %*% t(SelfSim) - 2 * Sim
    Diffs
  }
  
  # CHECK IF PACKAGES ARE INSTALLED
  pkg <- "reemsplots2"
  if (!(requireNamespace(pkg, quietly = TRUE))) {
    cat(
      error(
        "Package",
        pkg,
        " needed for this function to work. Please install it using: \n
    devtools::install_github('dipetkov/reemsplots2')"
      )
    )
    return(-1)
  }
  
  pkg <- "sf"
  if (!(requireNamespace(pkg, quietly = TRUE))) {
    cat(error(
      "Package",
      pkg,
      " needed for this function to work. Please install it.\n" ))
    return(-1)
  }
  
  pkg <- "dismo"
  if (!(requireNamespace(pkg, quietly = TRUE))) {
    cat(error(
      "Package",
      pkg,
      " needed for this function to work. Please install it.\n"
    ))
    return(-1)
  } else {
    # SET VERBOSITY
    verbose <- gl.check.verbosity(verbose)
    
    # SET WORKING DIRECTORY
    plot.dir <- gl.check.wd(plot.dir, verbose = 0)
    
    #out.dir
    
    if (is.null(out.dir))
      out.dir <- tempdir()
    
    # FLAG SCRIPT START
    funname <- match.call()[[1]]
    utils.flag.start(func = funname,
                     build = "v.2023.2",
                     verbose = verbose)
    
    # CHECK DATATYPE
    if (!is.null(x)) {
      dt <- utils.check.datatype(x, verbose = 0)
    }
    
    # FUNCTION SPECIFIC ERROR CHECKING
    if (is.null(plot.file)){
      plot.file <- "eems"
    }
    
    #removing loci with all missing data
    x <- gl.filter.allna(x, verbose = 0)
    
    # DO THE JOB
    # create dissimilarity matrix
    D <- bed2diffs_v2(as.matrix(x))
    
    # Write dissimilarity matrix
    write.table(
      D,
      file.path(tempdir(), paste0(plot.file, ".diffs")),
      col.names = FALSE,
      row.names = FALSE,
      quote = FALSE
    )
    
    write.table(
      x = matrix(
        c(
          paste0("datapath = ", file.path(tempdir(), plot.file)),
          paste0("mcmcpath = ", file.path(
            tempdir(), paste0("data_", plot.file)
          )),
          paste0("nIndiv = ", nInd(x)),
          paste0("nSites = ", nLoc(x)),
          paste0("nDemes = ", nDemes),
          paste0("diploid = ", diploid),
          paste0("numMCMCIter = ", format(numMCMCIter, scientific = FALSE)),
          paste0("numBurnIter = ", format(numBurnIter, scientific = FALSE)),
          paste0("numThinIter = ", format(numThinIter, scientific = FALSE))
        ),
        nrow = 9,
        ncol = 1
      ),
      file = file.path(tempdir(), paste0("param_", plot.file, ".ini")),
      row.names = FALSE,
      quote = FALSE,
      col.names = FALSE
    )
    
    #### create datapath file (outer polygon)
    y <- NULL
    
    ll <- data.frame(x = x@other$latlon$lon,
                     y = x@other$latlon$lat)
    xy <- dismo::Mercator(ll)
    # plot(xy)
    hpts <- chull(xy)
    hpts <- c(hpts, hpts[1])
    poly <- xy[hpts,]
    
    p <- sf::st_polygon(list(as.matrix(poly)))
    pbuf <- sf::st_buffer(p, buffer)
    plot(pbuf,
         axes = TRUE,
         border = "green",
         lwd = 2)
    plot(p, add = TRUE, col = "red")
    points(xy, pch = 20, col = "blue")
    
    pxy <- sf::st_coordinates(pbuf)[, 1:2]
    
    # Write outer
    
    write.table(
      x = pxy,
      quote = FALSE,
      file = file.path(tempdir(), paste0(plot.file, ".outer")),
      row.names = FALSE,
      col.names = FALSE
    )
    
    #write coordinates
    
    write.table(
      x = xy,
      quote = FALSE,
      file = file.path(tempdir(), paste0(plot.file, ".coord")),
      row.names = FALSE,
      col.names = FALSE
    )
    
    if (is.null(seed)){
      seed <- round(runif(1, 1, 1000000))
    }
    
    if (Sys.info()["sysname"] == "Windows") {
      prog <- "runeems_snps.exe"
      cmd <-
        paste0(
          "runeems_snps.exe --params ",
          paste0("param_", plot.file, ".ini"),
          paste0(" --seed ", seed)
        )
    }
    
    if (Sys.info()["sysname"] == "Linux")  {
      prog <- "runeems_snps"
      cmd <-
        paste0(
          "./runeems_snps --params ",
          paste0("param_", plot.file, ".ini"),
          paste0(" --seed ", seed)
        )
    }
    
    if (Sys.info()["sysname"] == "Darwin") {
      prog <- "runeems_snps"
      cmd <-
        paste0(
          "./runeems_snps --params ",
          paste0("param_", plot.file, ".ini"),
          paste0(" --seed ", seed)
        )
    }
    
    # check if file program can be found
    if (file.exists(file.path(eems.path, prog))) {
      ff <- file.copy(file.path(eems.path, prog),
                      to = tempdir(),
                      overwrite = TRUE)
    } else {
      cat(
        error(
          "  Cannot find",
          prog,
          "in the specified folder given by eems.path:",
          eems.path,
          "\n"
        )
      )
      stop()
    }
    
    # change into tempdir (run it there)
    old.path <- getwd()
    setwd(tempdir())
    on.exit(setwd(old.path))
    ### run eems
    if (Sys.info()["sysname"] == "Linux"){
      system("chmod +x runeems_snps")
    }
    
    #cmd <- paste0(paste0(prog,"  --params ",  paste0("param_", plot.file, ".ini "), paste0("--seed ",seed )))
    
    system(cmd)
    
    eems_results <- file.path(tempdir(), paste0("data_", plot.file))
    eems_files <- list.files(tempdir(), pattern = plot.file)
    if (out.dir != tempdir()) {
      file.copy(
        eems_results,
        to = out.dir,
        overwrite = TRUE,
        recursive = TRUE
      )
      file.copy(eems_files, to = out.dir, overwrite = TRUE)
    }
    if (!exists('make_eems_plots', mode="function")) {make_eems_plots <- function() return (-1);
    error("You need to load the package reemsplot2 from github via:\n
          devtools::install_github('dipetkov/reemsplots2 \n")}  
    p8 <- make_eems_plots(mcmcpath =  eems_results,
                                       longlat = TRUE,
                                       dpi = dpi,
                                       add_grid = add_grid,
                                       col_grid = col_grid, 
                                       add_demes = add_demes, 
                                       col_demes = col_demes,
                                       add_outline = add_outline,
                                       col_outline = col_outline, 
                                       eems_colors = eems_colors
    )
    
    # if pop colors is a palette
    if (is(plot.colors.pop, "function")) {
      colors_pops <- plot.colors.pop(length(levels(pop(x))))
    }
    # if pop colors is a vector
    if (!is(plot.colors.pop, "function")) {
      colors_pops <- plot.colors.pop
    }
    
    xy_plot <-
      data.frame(x = xy[, 1], y = xy[, 2], pop = as.character(pop(x)))
    p8[[1]] <- p8$mrates01 +
      geom_point(data = xy_plot, aes(x = x, y = y, color = pop)) +
      scale_color_manual(values = colors_pops) +
      coord_equal()
    p8[[2]]  <- p8$mrates02 +
      geom_point(data = xy_plot, aes(x = x, y = y, color = pop)) +
      scale_color_manual(values = colors_pops) +
      coord_equal()
    p8[[3]]  <- p8$qrates01 +
      geom_point(data = xy_plot, aes(x = x, y = y, color = pop)) +
      scale_color_manual(values = colors_pops) +
      coord_equal()
    p8[[4]]  <- p8$qrates02 +
      geom_point(data = xy_plot, aes(x = x, y = y, color = pop)) +
      scale_color_manual(values = colors_pops)  +
      coord_equal()
    
  print(p8)
    
    if (cleanup) {
      unlink(eems_results, recursive = TRUE)
      unlink(eems_files, recursive = TRUE)
    }
    
    # Optionally save the plot ---------------------
    
    if (!is.null(plot.file)) {
      tmp <- utils.plot.save(p8,
                             dir = plot.dir,
                             file = plot.file,
                             verbose = verbose)
    }
    return(p8)
    
  }
}

Try the dartR.spatial package in your browser

Any scripts or data that you put into this service are public.

dartR.spatial documentation built on March 16, 2026, 9:07 a.m.