Description Usage Arguments Details Value Note Author(s) References Examples
Bayesian Phylogeographic and Ecological Clustering (BPEC) is aimed at identifying genetically, geographically and ecologically distinct population clusters and drawing inferences about ancestral locations. Given a dataset of DNA sequences (nonrecombinant, typically mtDNA) and their respective geographical locations (longitude and latitude) along with additional environmental or phenotypic characteristics, the algorithm simultaneously draws inferences about the genealogy (in the form of a haplotype tree) and the population clustering. In addition, the algorithm identifies locations with high posterior probability of being ancestral.
1 2 3 4 5 6 7 8 9 10 11  bpec(seqsFile, coordsFile, dims = 2, iter = 100000, postSamples = 100,
maxMig = 5, ds = 0, colorCode = c(7,5,6,3,2,8))
## S3 method for class 'bpec'
plot(x, GoogleEarth = 0, colorCode = c(7,5,6,3,2,8),...)
## S3 method for class 'bpec'
summary(object,...)
## S3 method for class 'bpec'
print(x,...)
## S3 method for class 'bpec'
mean(x,...)

x 
R object from 
object 
R object from 
colorCode 
A vector of color codes to use, ideally the same ones used in bpec.contourPlot. 
seqsFile 
The name of the NEXUS file in full, eg "SeqsFile.nex". 
coordsFile 
the name of the coordinate and sequence file in full, eg "CoordsLocs.txt". 
maxMig 
The maximum number of migration events (this means that the maximum number of clusters will be 
iter 
The number of iterations for the MCMC sampler, must be a multiple of 10. You will need quite a large number here, like 100,000. Two MCMC chains will run, after which convergence is checked. If convergence has not been reached, the output will say "NO CONVERGENCE" and you should increase the number of iterations. 
ds 
This represents the parsimony relaxation parameter, with 0 being the minimum. Generally the higher 
postSamples 
How many posterior samples per chain to save for use postprocessing. A value of at least 
dims 
The dimension, 2 for purely geographical data, +1 for each covariate (for example if environmental or phenotypic characteristics are also available). 
GoogleEarth 
If 1, .kml files are produced which can be opened with GoogleEarth. 
... 
Other default options. 
bpec requires 2 input files in order to run:
haplotypes.nex
:
The sequence file in NEXUS format. Sequence labels should either be integers, or contain unique integers which correspond to the labels in the CoordsLocsFile.txt
. For example, '1', '1_label', 'label1_label' will all be treated as haplotype 1. NOTE: bpec will currently ignore unknown nucleotides in the inference.
coordsLocsFile.txt
:
For each observation, the coordinates (latitude and longitude, please use a +/ to indicate W or E), any other environmental or phenotypic covariates (the latitude and longitude MUST come first), plus the ID number of the haplotype (must match the number in the sequence NEXUS file). If more than one haplotype were found at a single location, these can be entered one after the other, eg:
1 2 3  36.88 5.42 24 25
37.00 3.98 245 251 243 142 143 244 246 247
43.35 1.48 153

so, in the first location (lat/long 36.88, 5.42) you have 2 sampled individuals with haplotypes 24,25, in the second location eight individuals etc. Sequences don't necessarily need to be collapsed onto haplotypes, the program should take care of it.
The main function is bpec.mcmc
and runs a Markov chain Monte Carlo sampler in order to obtain posterior samples for all the parameters simultaneously: the haplotype tree, the ancestral nodes, the number of population clusters as well as their means and covariances.
Three plotting functions are available. bpec.contourPlot
shows the inferred population clusters superimposed on a world map. bpec.treePlot
shows the Maximum A Posteriori rooted haplotype tree, indicating posterior cluster membership and number of times each haplotype was sampled. It can also output .kml files which can be loaded into Google Earth. bpec.CovariatesPlot
shows the posterior density estimates for each additional covariate of each cluster.
bpec
object which can be analysed using input()
, data()
, clust()
, tree()
and mcmc()
.
bpec.mcmc
uses cexcept.h 2.0.1
(an interface for exceptionhandling in ANSI C) developed by Adam M. Costello and Cosmin Truta.
Ioanna Manolopoulou <[email protected]>, Axel Hille <[email protected]>
I. Manolopoulou and B.C. Emerson (2012). Phylogeographic ancestral inference using the coalescent model on haplotype trees. Journal of Computational Biology, 19(6), 745755.
I. Manolopoulou, L. Legarreta, B.C. Emerson, S. Brooks, and S. Tavar? (2011). A Bayesian approach to phylogeographic clustering. Interface focus, rsfs20110054.
S.P. Brooks, I. Manolopoulou, and B.C. Emerson (2007). Assessing the Effect of Genetic Mutation  A Bayesian Framework for Determining Population History from DNA Sequence Data. Bayesian Statistics 8. Oxford University Press.
Adam M. Costello and Cosmin Truta (2008) cexcept.h
exception handling interface in C, available at website http://www.nicemice.net/cexcept/.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93  ## if you want to load the `mini' example Brown Frog dataset
data(MacrocnemisRawSeqs)
data(MacrocnemisCoordsLocsMini)
rawSeqs < MacrocnemisRawSeqs
coordsLocs < MacrocnemisCoordsLocsMini
dims < 3 #this is 2 if you only have geographical longitude/latitude.
#(add 1 for each environmental or phenotypic covariate)
maxMig < 2 #you will need a higher maximum number of migrations, suggest 7
ds < 0 #start with ds=0 and increase to 1 and then to 2
iter < 1000 #you will need far more iterations for convergence, start with 100,000
postSamples < 100 #you will need at least 100 saved posterior samples
#run the Markov chain Monte Carlo sampler
bpecout < bpec.mcmc(rawSeqs,coordsLocs,maxMig,iter,ds,postSamples,dims)
par(mar=c(0,0,0,0),pty="m",mfrow=c(1,2)) #no plot margins, plot contours and tree sidebyside
# plot geographical cluster contour map
bpec.contourPlot(bpecout,GoogleEarth=0)
# plot tree network with cluster indicators
bpec.Tree < bpec.treePlot(bpecout)
# now also plot the environmental covariates
bpec.covariatesPlot(bpecout)
bpec.Geo < bpec.geoTree(bpecout,file="GoogleEarthTree.kml")
## Not run:
# if you want to load the example burnet moth dataset
data(TransalpinaRawSeqs)
data(TransalpinaCoordsLocs)
rawSeqs < TransalpinaRawSeqs
coordsLocs < TransalpinaCoordsLocs
##if you want to use your own dataset, use setwd() to enter the correct folder,
##then run the command below, changing the input parameters if necessary
#rawSeqs < bpec.loadSeq('haplotypes.nex')
#coordsLocs < bpec.loadCoords("coordsLocsFile.txt")
## to set phenotypic/environmental covariate names manually, use (as appropriate)
# colnames(CoordsLocs)[1:dims] < c('lat','long','cov1','cov2','cov3')
## where dims is the corresponding number of measurements available
## (2 for latitude and longitude only, add one for each additional available measurement)
dims < 2 #this is 2 if you only have geographical longitude/latitude.
#(add 1 for each environmental or phenotypic covariate)
maxMig < 5 #you will need a higher maximum number of migrations, suggest 7
ds < 0 #start with ds=0 and increase to 1 and then to 2
iter < 10000 #you will need far more iterations for convergence, start with 100,000
postSamples < 2 #you will need at least 100 saved posterior samples
#run the Markov chain Monte Carlo sampler
bpecout < bpec.mcmc(rawSeqs,coordsLocs,maxMig,iter,ds,postSamples,dims)
par(mar=c(0,0,0,0),pty="m",mfrow=c(1,2)) #No plot margins. Contours and tree sidebyside
# plot geographical cluster contour map
bpec.contourPlot(bpecout, GoogleEarth=0, mapType = 'plain')
# plot tree network with cluster indicators
bpec.Tree < bpec.treePlot(bpecout)
## if you want to load the example Brown Frog dataset
data(MacrocnemisRawSeqs)
data(MacrocnemisCoordsLocs)
rawSeqs < MacrocnemisRawSeqs
coordsLocs < MacrocnemisCoordsLocs
dims < 8 #this is 2 if you only have geographical longitude/latitude.
#(add 1 for each environmental or phenotypic covariate)
maxMig < 4 #you will need a higher maximum number of migrations, suggest 7
ds < 2 #start with ds=0 and increase to 1 and then to 2
iter < 10000 #you will need far more iterations for convergence, start with 100,000
postSamples < 2 #you will need at least 100 saved posterior samples
#run the Markov chain Monte Carlo sampler
bpecout < bpec.mcmc(rawSeqs,coordsLocs,maxMig,iter,ds,postSamples,dims)
par(mar=c(0,0,0,0),pty="m",mfrow=c(1,2)) #no plot margins, plot contours and tree sidebyside
# plot geographical cluster contour map
bpec.contourPlot(bpecout,GoogleEarth=0)
# plot tree network with cluster indicators
bpec.Tree < bpec.treePlot(bpecout)
# now also plot the environmental covariates
par(mfrow=c(2,3)) #split the plot window into 2x3 to fit all the covariates
bpec.covariatesPlot(bpecout)
bpec.Geo < bpec.geoTree(bpecout,file="GoogleEarthTree.kml")
## End(Not run)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.