Typical 'a posteriori' TimeScaling Approaches For Paleontological Phylogenies
Description
Timescales an unscaled cladogram of fossil taxa using information on their
temporal ranges, using various methods. Also can resolve polytomies randomly
and output samples of randomlyresolved trees. As simple methods of timescaling
phylogenies of fossil taxa can have biasing effects on macroevolutionary analyses
(Bapst, 2014, Paleobiology), this function is largely retained for legacy purposes
and plotting applications. The timescaling methods implemented
by the functions listed here do not return realistic estimates of
divergence dates, users should investigate other timescaling methods such as cal3TimePaleoPhy
.
Usage
1 2 3 4 5 6 7 8 9 10  timePaleoPhy(tree, timeData, type = "basic", vartime = NULL, ntrees = 1,
randres = FALSE, timeres = FALSE, add.term = FALSE,
inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL,
noisyDrop = TRUE, plot = FALSE)
bin_timePaleoPhy(tree, timeList, type = "basic", vartime = NULL,
ntrees = 1, nonstoch.bin = FALSE, randres = FALSE, timeres = FALSE,
sites = NULL, point.occur = FALSE, add.term = FALSE,
inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL,
noisyDrop = TRUE, plot = FALSE)

Arguments
tree 
An unscaled cladogram of fossil taxa, of class 
timeData 
Twocolumn matrix of first and last occurrences in absolute
continuous time, with row names as the taxon IDs used on the tree. This means the
first column is very precise FADs (first appearance dates) and the second
column is very precise LADs (last appearance dates), reflect the precise points
in time when taxa first and last appear. If there is stratigraphic uncertainty in
when taxa appear in the fossil record, it is preferable to use the 'bin'
timescaling functions; however, see the argument 
type 
Type of timescaling method used. Can be "basic", "equal", "equal_paleotree_legacy", "equal_date.phylo_legacy" "aba", "zbla" or "mbl". Type="basic" by default. See details below. 
vartime 
Time variable; usage depends on the method 'type' argument. Ignored if type = "basic". 
ntrees 
Number of timescaled trees to output. If ntrees is greater than one and both randres is false and dateTreatment is neither 'minMax' or 'randObs', the function will fail and a warning is issued, as these arguments would simply produce multiple identical timescaled trees. 
randres 
Should polytomies be randomly resolved? By default,

timeres 
Should polytomies be resolved relative to the order of
appearance of lineages? By default, 
add.term 
If true, adds terminal ranges. By default, this function will
not add the ranges of taxa when timescaling a tree, so that the tips
correspond temporally to the first appearance datums of the given taxa. If

inc.term.adj 
If true, includes terminal ranges in branch length
estimates for the various adjustment of branch lengths under all methods
except 'basic' (i.e. a terminal length branch will not be treated as zero
length is this argument is 
dateTreatment 
This argument controls the interpretation of timeData. The default setting
'firstLast' treats the dates in timeData as a column of precise first and last appearances,
such that first appearances will be used to date nodes and last appearances will only be
called on if 
node.mins 
The minimum dates of internal nodes (clades) on a phylogeny can be set
using node.mins. This argument takes a vector of the same length as the number of nodes,
with dates given in the same order as nodes are ordered in the 
noisyDrop 
If 
plot 
If 
timeList 
A list composed of two matrices giving interval times and
taxon appearance dates. The rownames of the second matrix should be the taxon IDs,
identical to the 
nonstoch.bin 
If 
sites 
Optional two column matrix, composed of site IDs for taxon FADs and LADs. The sites argument allows users to constrain the placement of dates by restricting multiple fossil taxa whose FADs or LADs are from the same very temporally restricted sites (such as fossilrich Lagerstatten) to always have the same date, across many iterations of timescaled trees. To do this, simply give a matrix where the "site" of each FAD and LAD for every taxon is listed, as corresponding to the second matrix in timeList. If no sites matrix is given (the default), then it is assumed all fossil come from different "sites" and there is no shared temporal structure among the events. 
point.occur 
If true, will automatically produce a 'sites' matrix which forces all FADs and LADs to equal each other. This should be used when all taxa are only known from single 'point occurrences', i.e. each is only recovered from a single bed/horizon, such as a Lagerstatten. 
Details
TimeScaling Methods
These functions are an attempt to unify and collect previously used and discussed 'a posteriori' methods for timescaling phylogenies of fossil taxa. Unfortunately, it can be difficult to attribute some timescaling methods to specific references in the literature.
There are five main a posteriori approaches that can be used by timePaleoPhy
. Four of these
main types use some value of absolute time, chosen a priori, to timescale the tree.
This is handled by the argument vartime
, which is NULL by default and unused
for type "basic".
 "basic"
This most simple of timescaling methods ignores
vartime
and scales nodes so they are as old as the first appearance of their oldest descendant (Smith, 1994). This method produces many zerolength branches (Hunt and Carrano, 2010). "equal"
The 'equal' method defined by G. Lloyd and used in Brusatte et al. (2008) and Lloyd et al. (2012). Originally usable in code supplied by G. Lloyd, the algorithm is recreated here as closely as possible. This method works by increasing the time of the root divergence by some amount and then adjusting zerolength branches so that time on early branches is reapportioned out along those later branches equally. Branches are adjusted in order relative to the number of nodes separating the edge from the root, going from the furthest (most shallow) edges to the deepest edges. The choice of ordering algorithm can have an unanticipated large effect on the resulting timescaled trees created using "equal" and it appears that paleotree and functions written by G. Lloyd were not always consistent. The default option described here was only introduced into either software sources in August 2014. Thus, two legacy 'equal' methods are included in this function, so users can emulate older ordering algorithms for 'equal' which are now deprecated, as they do not match the underlying logic of the original 'equal' algorithm and do not minimize downpasses when adjusting branch lengths on the timescaled tree.
The root age can be adjusted backwards in time by either increasing by an arbitrary amount (via the
vartime
argument) or by setting the root age directly (via thenode.mins
argument); conversely, the function will also allow a user to opt to not alter the root age at all. "equal_paleotree_legacy"
Exactly like 'equal' above, except that edges are ordered instead by their depth (i.e. number of nodes from the root). This minor modified version was referred to as 'equal' for this
timePaleoPhy
function inpaleotree
until February 2014, and thus is included here solely for legacy purposes. This ordering algorithm does not minimize branch adjustment cycles, like the newer default offered under currently 'equal'. "equal_date.phylo_legacy"
Exactly like 'equal' above, except that edges are ordered relative to their time (ie. total edge length) from the root following the application of the 'basic' timescaling method, exactly as in G. Lloyd's original application. This was the method for sorting edges in the "equal" algorithm in G. Lloyd's
date.phylo
script andDatePhylo
in packagestrap
until August 2014, and was the default "equal" algorithm inpaleotree
'stimePaleoPhy
function from February 2014 until August 2014. This ordering algorithm does not minimize branch adjustment cycles, like the newer default offered under currently 'equal'. Due to how the presence of zerolength branches can make ordering branches based on time to be very unpredictable, this version of the 'equal' algorithm is highly not recommended. "aba"
All branches additive. This method takes the "basic" tree and adds vartime to all branches. Note that this timescaling method can warp the tree structure, leading to tips to originate out of order with the appearance data used.
 "zlba"
Zerolength branches additive. This method adds vartime to all zerolength branches in the "basic" tree. Discussed (possibly?) by Hunt and Carrano, 2010.Note that this timescaling method can warp the tree structure, leading to tips to originate out of order with the appearance data used.
 "mbl"
Minimum branch length. Scales all branches so they are greater than or equal to vartime, and subtract time added to later branches from earlier branches in order to maintain the temporal structure of events. A version of this was first introduced by Laurin (2004).
These functions cannot timescale branches relative to reconstructed
character changes along branches, as used by Lloyd et al. (2012). Please
see DatePhylo
in R package strap
for this functionality.
These functions will intuitively drop taxa from the tree with NA for range
or are missing from timeData
or timeList
. Taxa dropped from the tree will be
will be listed in a message output to the user. The same is done for taxa in
the timeList
object not listed in the tree.
As with many functions in the paleotree
library, absolute time is always
decreasing, i.e. the present day is zero.
As of August 2014, please note that the branchordering algorithm used in 'equal' has changed
to match the current algorithm used by DatePhylo
in package strap
, and that two legacy
versions of 'equal' have been added to this function, respectively representing how timePaleoPhy
and DatePhylo
(and its predecessor date.phylo
) applied the 'equal' timescaling method.
Interpretation of Taxon Ages in timePaleoPhy
timePaleoPhy
is primarily designed for direct application to datasets where taxon first
and last appearances are precisely known in continuous time, with no stratigraphic
uncertainty. This is an uncommon form of data to have from the fossil record,
although not an impossible form (micropaleontologists often have very precise
range charts, for example). Instead, most data has some form of stratigraphic uncertainty. However, for some groups,
the more typical 'first' and 'last' dates found in the literature or in databases represent the minimum
and maximum absolute ages for the fossil collections that a taxon is known
is known from. Presumably, the first and last appearances of that taxon in
the fossil record is at unknown dates within these bounds.
As of paleotree version 2.0. the treatment of taxon ages in timePaleoPhy
is handled by the argument dateTreatment
.
By default, this argument is set to 'firstLast' which means the matrix of ages are treated
as precise first and last appearance dates (i.e. FADs and LADs). The earlier FADs will be used
to calibrate the node ages, which could produce fairly nonsensical results if these are 'minimum'
ages instead and reflect age uncertainty. Alternatively, dateTreatment
can be set to 'minMax'
which instead treats taxon age data as minimum and maximum bounds on a single point date.
These point dates, if the minimum and maximum bounds option is selected,
are chose under a uniform distribution. Many timescaled trees should be created to approximate
the uncertainty in the dates. Additionally, there is a third option for dateTreatment
:
users may also make it so that the 'times of observation'
of trees are uncertain, such that the tips of the tree (with terminal ranges added) should
be randomly selected from a uniform distribution. Essentially, this third option treats the
dates as first and last appearances, but treats the first appearance dates as known and
fixed, but the 'last appearance' dates as unknown. In previous versions of paleotree,
this third option was enacted with the argument rand.obs
, which has been removed for
clarity.
Interpretation of Taxon Ages in bin_timePaleoPhy
As an alternative to using timePaleoPhy
, bin_timePaleoPhy
is a wrapper of
timePaleoPhy
which produces timescaled trees for datasets which only have
interval data available. For each output tree, taxon first and last appearance
dates are placed within their listed intervals under a uniform distribution.
Thus, a large sample of timescaled trees will approximate the uncertainty in
the actual timing of the FADs and LADs. In some ways, treating taxonomic age uncertainty
may be more logical via bin_timePaleoPhy
, as it is tied to specific interval bounds,
and there are more options available for certain types of age uncertainty, such as for cases
where specimens come from the same fossil site.
The input timeList
object for bin_timePaleoPhy
can have overlapping
(i.e. nonsequential) intervals, and intervals of uneven size. Taxa alive in the modern should be listed as last
occurring in a time interval that begins at time 0 and ends at time 0. If taxa
occur only in single collections (i.e. their first and last appearance in the
fossil record is synchronous, the argument point.occur will force all taxa
to have instantaneous durations in the fossil record. Otherwise, by default,
taxa are assumed to first and last appear in the fossil record at different points
in time, with some positive duration. The sites matrix can be used to force
only a portion of taxa to have simultaneous first and last appearances.
By setting the argument nonstoch.bin to TRUE for bin_timePaleoPhy
, the dates are NOT
stochastically pulled from uniform bins but instead FADs are assigned to the
earliest time of whichever interval they were placed in and LADs are placed
at the most recent time in their placed interval. This option may be useful
for plotting. The sites argument becomes arbitrary if nonstoch.bin = TRUE
.
If timeData
or the elements of timeList
are actually data.frames
(as output
by read.csv
or read.table
), these will be coerced to a matrix.
Tutorial
A tutorial for applying the timescaling functions in paleotree, along with an example using real (graptolite) data, can be found here: http://nemagraptus.blogspot.com/2013/06/atutorialtocal3timescalingusing.html
Value
The output of these functions is a timescaled tree or set of
timescaled trees, of either class phylo
or multiphylo
, depending on the
argument ntrees. All trees are output with an element $root.time. This is
the time of the root on the tree and is important for comparing patterns
across trees. Note that the $root.time element is defined relative to the
earliest first appearance date, and thus later tips may seem to occur in
the distant future under the 'aba' and 'zbla' timescaling methods.
Trees created with bin_timePaleoPhy
will output with some additional
elements, in particular $ranges.used, a matrix which records the
continuoustime ranges generated for timescaling each tree. (Essentially a
pseudotimeData
matrix.)
Note
Please account for stratigraphic uncertainty in your analysis.
Unless you have exceptionally resolved data, select an appropriate option in
dateTreatment
within timePaleoPhy
, use the more sophisticated
bin_timePaleoPhy
or code your own wrapper function of timePaleoPhy
that accounts for stratigraphic uncertainty in your dataset.
Author(s)
David W. Bapst, heavily inspired by code supplied by Graeme Lloyd and Gene Hunt.
References
Bapst, D. W. 2013. A stochastic ratecalibrated method for timescaling phylogenies of fossil taxa. Methods in Ecology and Evolution. 4(8):724733.
Bapst, D. W. 2014. Assessing the effect of timescaling methods on phylogenybased analyses in the fossil record. Paleobiology 40(3):331351.
Brusatte, S. L., M. J. Benton, M. Ruta, and G. T. Lloyd. 2008 Superiority, Competition, and Opportunism in the Evolutionary Radiation of Dinosaurs. Science 321(5895):148591488.
Hunt, G., and M. T. Carrano. 2010 Models and methods for analyzing phenotypic evolution in lineages and clades. In J. Alroy, and G. Hunt, eds. Short Course on Quantitative Methods in Paleobiology. Paleontological Society.
Laurin, M. 2004. The Evolution of Body Size, Cope's Rule and the Origin of Amniotes. Systematic Biology 53(4):594622.
Lloyd, G. T., S. C. Wang, and S. L. Brusatte. 2012 Identifying Heterogeneity in Rates of Morphological Evolutio: Discrete Character Change in the Evolution of Lungfish(Sarcopterygii, Dipnoi). Evolution 66(2):330–348.
Smith, A. B. 1994 Systematics and the fossil record: documenting evolutionary patterns. Blackwell Scientific, Oxford.
See Also
cal3TimePaleoPhy
, binTimeData
,
multi2di
For an alternative timescaling function, which includes the 'ruta' method
that weights the timescaling of branches by estimates of character change
along with implementations of the 'basic' and 'equal' methods described here,
please see function DatePhylo
in package strap
.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138  # examples with empirical data
#load data
data(retiolitinae)
#Can plot the unscaled cladogram
plot(retioTree)
#Can plot discrete time interval diversity curve with retioRanges
taxicDivDisc(retioRanges)
#Use basic timescaling (terminal branches only go to FADs)
ttree<bin_timePaleoPhy(tree=retioTree,timeList=retioRanges,type="basic",
ntrees=1, plot=TRUE)
#Use basic timescaling (terminal branches go to LADs)
ttree<bin_timePaleoPhy(tree=retioTree,timeList=retioRanges,type="basic",
add.term=TRUE, ntrees=1, plot=TRUE)
#mininum branch length timescaling (terminal branches only go to FADs)
ttree<bin_timePaleoPhy(tree=retioTree,timeList=retioRanges,type="mbl",
vartime=1, ntrees=1, plot=TRUE)
###################
# examples with simulated data
#Simulate some fossil ranges with simFossilRecord
set.seed(444)
record<simFossilRecord(p=0.1, q=0.1, nruns=1,
nTotalTaxa=c(30,40), nExtant=0)
taxa<fossilRecord2fossilTaxa(record)
#simulate a fossil record with imperfect sampling with sampleRanges
rangesCont < sampleRanges(taxa,r=0.5)
#let's use taxa2cladogram to get the 'ideal' cladogram of the taxa
cladogram < taxa2cladogram(taxa,plot=TRUE)
#Now let's try timePaleoPhy using the continuous range data
ttree < timePaleoPhy(cladogram,rangesCont,type="basic",plot=TRUE)
#plot diversity curve
phyloDiv(ttree)
#that tree lacked the terminal parts of ranges (tips stops at the taxon FADs)
#let's add those terminal ranges back on with add.term
ttree < timePaleoPhy(cladogram,rangesCont,type="basic",add.term=TRUE,plot=TRUE)
#plot diversity curve
phyloDiv(ttree)
#that tree didn't look very resolved, does it? (See Wagner and Erwin 1995 to see why)
#can randomly resolve trees using the argument randres
#each resulting tree will have polytomies randomly resolved in different ways using multi2di
ttree < timePaleoPhy(cladogram,rangesCont,type="basic",ntrees=1,randres=TRUE,
add.term=TRUE,plot=TRUE)
#notice well the warning it prints!
#we would need to set ntrees to a large number to get a fair sample of trees
#if we set ntrees>1, timePaleoPhy will make multiple timetrees
ttrees < timePaleoPhy(cladogram,rangesCont,type="basic",ntrees=9,randres=TRUE,
add.term=TRUE,plot=TRUE)
#let's compare nine of them at once in a plot
layout(matrix(1:9,3,3))
parOrig < par(no.readonly=TRUE)
par(mar=c(1,1,1,1))
for(i in 1:9){plot(ladderize(ttrees[[i]]),show.tip.label=FALSE,no.margin=TRUE)}
#they are all a bit different!
#we can also resolve the polytomies in the tree according to time of first appearance
#via the function timeLadderTree, by setting the argument 'timeres' to TRUE
ttree < timePaleoPhy(cladogram,rangesCont,type="basic",ntrees=1,timeres=TRUE,
add.term=TRUE,plot=TRUE)
#can plot the median diversity curve with multiDiv
layout(1);par(parOrig)
multiDiv(ttrees)
#compare different methods of timePaleoPhy
layout(matrix(1:6,3,2))
parOrig < par(no.readonly=TRUE)
par(mar=c(3,2,1,2))
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="basic",vartime=NULL,add.term=TRUE)))
axisPhylo();text(x=50,y=23,"type=basic",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="equal",vartime=10,add.term=TRUE)))
axisPhylo();text(x=55,y=23,"type=equal",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="aba",vartime=1,add.term=TRUE)))
axisPhylo();text(x=55,y=23,"type=aba",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="zlba",vartime=1,add.term=TRUE)))
axisPhylo();text(x=55,y=23,"type=zlba",adj=c(0,0.5),cex=1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type="mbl",vartime=1,add.term=TRUE)))
axisPhylo();text(x=55,y=23,"type=mbl",adj=c(0,0.5),cex=1.2)
layout(1);par(parOrig)
#using node.mins
#let's say we have (molecular??) evidence that node #5 is at least 1200 timeunits ago
#to use node.mins, first need to drop any unshared taxa
droppers < cladogram$tip.label[is.na(
match(cladogram$tip.label,names(which(!is.na(rangesCont[,1])))))]
cladoDrop < drop.tip(cladogram, droppers)
# now make vector same length as number of nodes
nodeDates < rep(NA, Nnode(cladoDrop))
nodeDates[5] < 1200
ttree1 < timePaleoPhy(cladoDrop,rangesCont,type="basic",
randres=FALSE,node.mins=nodeDates,plot=TRUE)
ttree2 < timePaleoPhy(cladoDrop,rangesCont,type="basic",
randres=TRUE,node.mins=nodeDates,plot=TRUE)
#Using bin_timePaleoPhy to timescale with discrete interval data
#first let's use binTimeData() to bin in intervals of 1 time unit
rangesDisc < binTimeData(rangesCont,int.length=1)
ttreeB1 < bin_timePaleoPhy(cladogram,rangesDisc,type="basic",ntrees=1,randres=TRUE,
add.term=TRUE,plot=FALSE)
#notice the warning it prints!
phyloDiv(ttreeB1)
#with timeorder resolving via timeLadderTree
ttreeB2 < bin_timePaleoPhy(cladogram,rangesDisc,type="basic",ntrees=1,timeres=TRUE,
add.term=TRUE,plot=FALSE)
phyloDiv(ttreeB2)
#can also force the appearance timings not to be chosen stochastically
ttreeB3 < bin_timePaleoPhy(cladogram,rangesDisc,type="basic",ntrees=1,
nonstoch.bin=TRUE,randres=TRUE,add.term=TRUE,plot=FALSE)
phyloDiv(ttreeB3)
# testing node.mins in bin_timePaleoPhy
ttree < bin_timePaleoPhy(cladoDrop,rangesDisc,type="basic",ntrees=1,
add.term=TRUE,randres=FALSE,node.mins=nodeDates,plot=TRUE)
# with randres = TRUE
ttree < bin_timePaleoPhy(cladoDrop,rangesDisc,type="basic",ntrees=1,
add.term=TRUE,randres=TRUE,node.mins=nodeDates,plot=TRUE)
#simple three taxon example for testing inc.term.adj
ranges1<cbind(c(3,4,5),c(2,3,1));rownames(ranges1)<paste("t",1:3,sep="")
clado1<read.tree(file=NA,text="(t1,(t2,t3));")
ttree1<timePaleoPhy(clado1,ranges1,type="mbl",vartime=1)
ttree2<timePaleoPhy(clado1,ranges1,type="mbl",vartime=1,add.term=TRUE)
ttree3<timePaleoPhy(clado1,ranges1,type="mbl",vartime=1,add.term=TRUE,inc.term.adj=TRUE)
layout(1:3)
ttree1$root.time;plot(ttree1);axisPhylo()
ttree2$root.time;plot(ttree2);axisPhylo()
ttree3$root.time;plot(ttree3);axisPhylo()
apply(ranges1,1,diff)
