R package to calculate the genealogical divergence index (GDI) for species delimitation analyses using BPP. Requires a maximally-split guide tree. GDI described in Jackson et al. 2017.
Tested with BPP v.4.1.4
To download and install this package
install.packages("devtools")
library(devtools)
install_github("dmacguigan/gdiPipeline")
library(gdiPipeline)
Example script
# GDI pileine with Percina kusha example data
# https://github.com/dmacguigan/gdiPipeline/tree/master/example
library(gdiPipeline)
###### PIPELINE PARAMETERS ######
###### general parameters
wd="./gdiPipeline/example"
plotColors = c(brewer.pal(12, "Paired"), "black") # for later plots of GDI
nLoci = 10
threads = 8
nreps = 4
###### input files
# maximally split guide tree
# newick format, include semicolon after tree
treefile="Pkus.tree"
# list of prior combinations
# one line per set of priors
# each line should contain: tau_alpha tau_beta theta_alpha theta_beta
priors="priors.txt"
###### BPP-specific input files
# see the BPP user manual for details on how each file should be formatted
# https://github.com/bpp/bpp
map="Pkus.Imap.txt"
heredity = "heredity.txt"
loci = "m100p_10loci.phy.txt"
# name for BPP template control file, this will be created in step 1
ctl = "ctlTemplate.ctl"
###### PIPELINE STEPS ######
# STEP 1
# create BPP control file template for bppInputs function
BPPCtlTemplate(wd)
# STEP 2
# Using a text editor, modify parameters like "burnin", "samplefreq", and/or "nsample"
# in the ctlTemplate.ctl file if necessary before proceeding. You may also want to
# specify "finetune" values based on preliminary runs if mixing/convergence is an issue.
# STEP 3
# create directories and files for BPP runs
# specify priors and maximally split tree
bppInputs(wd, treefile, map,
priors,
heredity, loci,
ctl, nLoci,
threads, nreps)
# STEP 4
# create task file, each line with a different command to run BPP
bppTaskFile(wd)
# STEP 5
# run the task file generated by bppTaskFile
# ideally parallelize using a job array setup
# for example: https://docs.ycrc.yale.edu/clusters-at-yale/job-scheduling/dsq/
# you may need to change the line endings of "BPPTaskFile.txt" to Unix
# if you ran step 4 on a Windows machine
# and plan to use a Linux machine for running BPP
# STEP 6
# assess mixing and convergence of BPP analyses
# diagnostic files are written in each model directory
# this may take a while to run depending on the number of analyses and the MCMC chain lengths
burnin = 0.1 # proportion of MCMC generations to discard as burnin
checkConvergence(wd=wd, nreps=nreps, burnin=burnin)
# STEP 7
# summarize the bpp results
# caluclate GDI
# plot GDI for each replicate run and binning replicate runs
burnin = 0.1 # proportion of MCMC generations to discard as burnin
gdi <- bppSummarizeGDI(wd, plotColors, nreps, burnin)
# plot GDI estimates for all species in one figure
plotByPrior(gdi, wd, nreps, priors, plotWidth=8, plotHeight=5)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.