timePaleoPhy: Typical 'a posteriori' Time-Scaling Approaches For...

Description Usage Arguments Details Value Note Author(s) References See Also Examples

View source: R/timePaleoPhy.R


Time-scales an unscaled cladogram of fossil taxa using information on their temporal ranges, using various methods. Also can resolve polytomies randomly and output samples of randomly-resolved trees. As simple methods of time-scaling phylogenies of fossil taxa can have biasing effects on macroevolutionary analyses (Bapst, 2014, Paleobiology), this function is largely retained for legacy purposes and plotting applications. The time-scaling methods implemented by the functions listed here do not return realistic estimates of divergence dates, users should investigate other time-scaling methods such as cal3TimePaleoPhy.


timePaleoPhy(tree, timeData, type = "basic", vartime = NULL, ntrees = 1,
  randres = FALSE, timeres = FALSE, add.term = FALSE,
  inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL,
  noisyDrop = TRUE, plot = FALSE)

bin_timePaleoPhy(tree, timeList, type = "basic", vartime = NULL,
  ntrees = 1, nonstoch.bin = FALSE, randres = FALSE, timeres = FALSE,
  sites = NULL, point.occur = FALSE, add.term = FALSE,
  inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL,
  noisyDrop = TRUE, plot = FALSE)



An unscaled cladogram of fossil taxa, of class phylo. Tip labels must match the taxon labels in the respective temporal data.


Two-column matrix of first and last occurrences in absolute continuous time, with row names as the taxon IDs used on the tree. This means the first column is very precise FADs (first appearance dates) and the second column is very precise LADs (last appearance dates), reflect the precise points in time when taxa first and last appear. If there is stratigraphic uncertainty in when taxa appear in the fossil record, it is preferable to use the 'bin' time-scaling functions; however, see the argument dateTreatment.


Type of time-scaling method used. Can be "basic", "equal", "equal_paleotree_legacy", "equal_date.phylo_legacy" "aba", "zbla" or "mbl". Type = "basic" by default. See details below.


Time variable; usage depends on the method 'type' argument. Ignored if type = "basic".


Number of time-scaled trees to output. If ntrees is greater than one and both randres is false and dateTreatment is neither 'minMax' or 'randObs', the function will fail and a warning is issued, as these arguments would simply produce multiple identical time-scaled trees.


Should polytomies be randomly resolved? By default, timePaleoPhy does not resolve polytomies, instead outputting a time-scaled tree that is only as resolved as the input tree. If randres = T, then polytomies will be randomly resolved using multi2di from the package ape. If randres = T and ntrees = 1, a warning is printed that users should analyze multiple randomly-resolved trees, rather than a single such tree, although a tree is still output.


Should polytomies be resolved relative to the order of appearance of lineages? By default, timePaleoPhy does not resolve polytomies, instead outputting a time-scaled tree that is only as resolved as the input tree. If timeres = TRUE, then polytomies will be resolved with respect to time using the paleotree function timeLadderTree. See that functions help page for more information; the result of time-order resolving of polytomies generally does not differ across multiple uses, unlike use of multi2di.


If true, adds terminal ranges. By default, this function will not add the ranges of taxa when time-scaling a tree, so that the tips correspond temporally to the first appearance datums of the given taxa. If add.term = T, then the 'terminal ranges' of the taxa are added to the tips after tree is time-scaled, such that the tips now correspond to the last appearance datums.


If true, includes terminal ranges in branch length estimates for the various adjustment of branch lengths under all methods except 'basic' (i.e. a terminal length branch will not be treated as zero length is this argument is TRUE if the taxon at this tip has a non-zero duration). By default, this argument is FALSE and this function will not include the ranges of taxa when adjusting branch lengths, so that zero-length branches before first appearance times will be extended. An error is returned if this argument is true but type = "basic" or add.term = FALSE, as this argument is inconsistent with those argument options.


This argument controls the interpretation of timeData. The default setting 'firstLast' treats the dates in timeData as a column of precise first and last appearances, such that first appearances will be used to date nodes and last appearances will only be called on if add.term = TRUE. A second option, added by great demand, is 'minMax' which treats these dates as minimum and maximum bounds on single point dates. Under this option, all taxa in the analysis will be treated as being point dates, such that the first appearance is also the last. These dates will be pulled under a uniform distribution. If 'minMax' is used, add.term becomes meaningless, and the use of it will return an error message. A third option is 'randObs'. This assumes that the dates in the matrix are first and last appearance times, but that the desired time of observation is unknown. Thus, this is much like 'firstLast' except the effective time of observation (the taxon's LAD under 'firstLast') is treated an uncertain date, and is randomly sampled between the first and last appearance times. The FAD still is treated as a fixed number, used for dating the nodes. In previous versions of paleotree, this was called in timePaleoPhy using the argument rand.obs, which has been removed for clarity. This temporal uncertainty in times of observation might be useful if a user is interested in applying phylogeny-based approaches to studying trait evolution, but have per-taxon measurements of traits that come from museum specimens with uncertain temporal placement. With both arguments 'minMax' and 'randObs', the sampling of dates from random distributions should compel users to produce many time-scaled trees for any given analytical purpose. Note that 'minMax' returns an error in 'bin' time-scaling functions; please use 'points.occur' instead.


The minimum dates of internal nodes (clades) on a phylogeny can be set using node.mins. This argument takes a vector of the same length as the number of nodes, with dates given in the same order as nodes are ordered in the tree$edge matrix. Note that in tree$edge, terminal tips are given the first set of numbers (1:Ntip(tree)), so the first element of node.mins is the first internal node (the node numbered Ntip(tree)+1, which is generally the root for most phylo objects read by read.tree). Not all nodes need be given minimum dates; those without minimum dates can be given as NA in node.mins, but the vector must be the same length as the number of internal nodes in tree. These are minimum date constraints, such that a node will be forced to be at least as old as this date, but the final date may be even older depending on the taxon dates used, the time-scaling method applied, the vartime used and any other minimum node dates given (e.g. if a clade is given a very old minimum date, this will (of course) over-ride any minimum dates given for clades that that node is nested within). Although vartime does adjust the node age downwards when the equal method is used, if a user has a specific date they'd like to constrain the root to, they should use node.mins instead because the result is more predictable.


If TRUE (the default), any taxa dropped from tree due to not having a matching entry in the time data will be listed in a system message.


If TRUE, plots the input and output phylogenies.


A list composed of two matrices giving interval times and taxon appearance dates. The rownames of the second matrix should be the taxon IDs, identical to the tip.labels for tree. See details.


If TRUE, dates are not stochastically pulled from uniform distributions. See below for more details.


Optional two column matrix, composed of site IDs for taxon FADs and LADs. The sites argument allows users to constrain the placement of dates by restricting multiple fossil taxa whose FADs or LADs are from the same very temporally restricted sites (such as fossil-rich Lagerstatten) to always have the same date, across many iterations of time-scaled trees. To do this, simply give a matrix where the "site" of each FAD and LAD for every taxon is listed, as corresponding to the second matrix in timeList. If no sites matrix is given (the default), then it is assumed all fossil come from different "sites" and there is no shared temporal structure among the events.


If true, will automatically produce a 'sites' matrix which forces all FADs and LADs to equal each other. This should be used when all taxa are only known from single 'point occurrences', i.e. each is only recovered from a single bed/horizon, such as a Lagerstatten.


Time-Scaling Methods

These functions are an attempt to unify and collect previously used and discussed 'a posteriori' methods for time-scaling phylogenies of fossil taxa. Unfortunately, it can be difficult to attribute some time-scaling methods to specific references in the literature.

There are five main a posteriori approaches that can be used by timePaleoPhy. Four of these main types use some value of absolute time, chosen a priori, to time-scale the tree. This is handled by the argument vartime, which is NULL by default and unused for type "basic".


This most simple of time-scaling methods ignores vartime and scales nodes so they are as old as the first appearance of their oldest descendant (Smith, 1994). This method produces many zero-length branches (Hunt and Carrano, 2010).


The 'equal' method defined by G. Lloyd and used in Brusatte et al. (2008) and Lloyd et al. (2012). Originally usable in code supplied by G. Lloyd, the algorithm is recreated here as closely as possible. This method works by increasing the time of the root divergence by some amount and then adjusting zero-length branches so that time on early branches is re-apportioned out along those later branches equally. Branches are adjusted in order relative to the number of nodes separating the edge from the root, going from the furthest (most shallow) edges to the deepest edges. The choice of ordering algorithm can have an unanticipated large effect on the resulting time-scaled trees created using "equal" and it appears that paleotree and functions written by G. Lloyd were not always consistent. The default option described here was only introduced into either software sources in August 2014. Thus, two legacy 'equal' methods are included in this function, so users can emulate older ordering algorithms for 'equal' which are now deprecated, as they do not match the underlying logic of the original 'equal' algorithm and do not minimize down-passes when adjusting branch lengths on the time-scaled tree.

The root age can be adjusted backwards in time by either increasing by an arbitrary amount (via the vartime argument) or by setting the root age directly (via the node.mins argument); conversely, the function will also allow a user to opt to not alter the root age at all.


Exactly like 'equal' above, except that edges are ordered instead by their depth (i.e. number of nodes from the root). This minor modified version was referred to as 'equal' for this timePaleoPhy function in paleotree until February 2014, and thus is included here solely for legacy purposes. This ordering algorithm does not minimize branch adjustment cycles, like the newer default offered under currently 'equal'.


Exactly like 'equal' above, except that edges are ordered relative to their time (ie. total edge length) from the root following the application of the 'basic' time-scaling method, exactly as in G. Lloyd's original application. This was the method for sorting edges in the "equal" algorithm in G. Lloyd's date.phylo script and DatePhylo in package strap until August 2014, and was the default "equal" algorithm in paleotree's timePaleoPhy function from February 2014 until August 2014. This ordering algorithm does not minimize branch adjustment cycles, like the newer default offered under currently 'equal'. Due to how the presence of zero-length branches can make ordering branches based on time to be very unpredictable, this version of the 'equal' algorithm is highly not recommended.


All branches additive. This method takes the "basic" tree and adds vartime to all branches. Note that this time-scaling method can warp the tree structure, leading to tips to originate out of order with the appearance data used.


Zero-length branches additive. This method adds vartime to all zero-length branches in the "basic" tree. Discussed (possibly?) by Hunt and Carrano, 2010.Note that this time-scaling method can warp the tree structure, leading to tips to originate out of order with the appearance data used.


Minimum branch length. Scales all branches so they are greater than or equal to vartime, and subtract time added to later branches from earlier branches in order to maintain the temporal structure of events. A version of this was first introduced by Laurin (2004).

These functions cannot time-scale branches relative to reconstructed character changes along branches, as used by Lloyd et al. (2012). Please see DatePhylo in R package strap for this functionality.

These functions will intuitively drop taxa from the tree with NA for range or are missing from timeData or timeList. Taxa dropped from the tree will be will be listed in a message output to the user. The same is done for taxa in the timeList object not listed in the tree.

As with many functions in the paleotree library, absolute time is always decreasing, i.e. the present day is zero.

As of August 2014, please note that the branch-ordering algorithm used in 'equal' has changed to match the current algorithm used by DatePhylo in package strap, and that two legacy versions of 'equal' have been added to this function, respectively representing how timePaleoPhy and DatePhylo (and its predecessor date.phylo) applied the 'equal' time-scaling method.

Interpretation of Taxon Ages in timePaleoPhy

timePaleoPhy is primarily designed for direct application to datasets where taxon first and last appearances are precisely known in continuous time, with no stratigraphic uncertainty. This is an uncommon form of data to have from the fossil record, although not an impossible form (micropaleontologists often have very precise range charts, for example). Instead, most data has some form of stratigraphic uncertainty. However, for some groups, the more typical 'first' and 'last' dates found in the literature or in databases represent the minimum and maximum absolute ages for the fossil collections that a taxon is known is known from. Presumably, the first and last appearances of that taxon in the fossil record is at unknown dates within these bounds.

As of paleotree version 2.0. the treatment of taxon ages in timePaleoPhy is handled by the argument dateTreatment. By default, this argument is set to 'firstLast' which means the matrix of ages are treated as precise first and last appearance dates (i.e. FADs and LADs). The earlier FADs will be used to calibrate the node ages, which could produce fairly nonsensical results if these are 'minimum' ages instead and reflect age uncertainty. Alternatively, dateTreatment can be set to 'minMax' which instead treats taxon age data as minimum and maximum bounds on a single point date. These point dates, if the minimum and maximum bounds option is selected, are chose under a uniform distribution. Many time-scaled trees should be created to approximate the uncertainty in the dates. Additionally, there is a third option for dateTreatment: users may also make it so that the 'times of observation' of trees are uncertain, such that the tips of the tree (with terminal ranges added) should be randomly selected from a uniform distribution. Essentially, this third option treats the dates as first and last appearances, but treats the first appearance dates as known and fixed, but the 'last appearance' dates as unknown. In previous versions of paleotree, this third option was enacted with the argument rand.obs, which has been removed for clarity.

Interpretation of Taxon Ages in bin_timePaleoPhy

As an alternative to using timePaleoPhy, bin_timePaleoPhy is a wrapper of timePaleoPhy which produces time-scaled trees for datasets which only have interval data available. For each output tree, taxon first and last appearance dates are placed within their listed intervals under a uniform distribution. Thus, a large sample of time-scaled trees will approximate the uncertainty in the actual timing of the FADs and LADs. In some ways, treating taxonomic age uncertainty may be more logical via bin_timePaleoPhy, as it is tied to specific interval bounds, and there are more options available for certain types of age uncertainty, such as for cases where specimens come from the same fossil site.

The input timeList object for bin_timePaleoPhy can have overlapping (i.e. non-sequential) intervals, and intervals of uneven size. Taxa alive in the modern should be listed as last occurring in a time interval that begins at time 0 and ends at time 0. If taxa occur only in single collections (i.e. their first and last appearance in the fossil record is synchronous, the argument point.occur will force all taxa to have instantaneous durations in the fossil record. Otherwise, by default, taxa are assumed to first and last appear in the fossil record at different points in time, with some positive duration. The sites matrix can be used to force only a portion of taxa to have simultaneous first and last appearances.

By setting the argument nonstoch.bin to TRUE for bin_timePaleoPhy, the dates are NOT stochastically pulled from uniform bins but instead FADs are assigned to the earliest time of whichever interval they were placed in and LADs are placed at the most recent time in their placed interval. This option may be useful for plotting. The sites argument becomes arbitrary if nonstoch.bin = TRUE.

If timeData or the elements of timeList are actually data.frames (as output by read.csv or read.table), these will be coerced to a matrix.


A tutorial for applying the time-scaling functions in paleotree, along with an example using real (graptolite) data, can be found here: http://nemagraptus.blogspot.com/2013/06/a-tutorial-to-cal3-time-scaling-using.html


The output of these functions is a time-scaled tree or set of time-scaled trees, of either class phylo or multiphylo, depending on the argument ntrees. All trees are output with an element $root.time. This is the time of the root on the tree and is important for comparing patterns across trees. Note that the $root.time element is defined relative to the earliest first appearance date, and thus later tips may seem to occur in the distant future under the 'aba' and 'zbla' time-scaling methods.

Trees created with bin_timePaleoPhy will output with some additional elements, in particular $ranges.used, a matrix which records the continuous-time ranges generated for time-scaling each tree. (Essentially a pseudo-timeData matrix.)


Please account for stratigraphic uncertainty in your analysis. Unless you have exceptionally resolved data, select an appropriate option in dateTreatment within timePaleoPhy, use the more sophisticated bin_timePaleoPhy or code your own wrapper function of timePaleoPhy that accounts for stratigraphic uncertainty in your dataset.


David W. Bapst, heavily inspired by code supplied by Graeme Lloyd and Gene Hunt.


Bapst, D. W. 2013. A stochastic rate-calibrated method for time-scaling phylogenies of fossil taxa. Methods in Ecology and Evolution. 4(8):724-733.

Bapst, D. W. 2014. Assessing the effect of time-scaling methods on phylogeny-based analyses in the fossil record. Paleobiology 40(3):331-351.

Brusatte, S. L., M. J. Benton, M. Ruta, and G. T. Lloyd. 2008 Superiority, Competition, and Opportunism in the Evolutionary Radiation of Dinosaurs. Science 321(5895):1485-91488.

Hunt, G., and M. T. Carrano. 2010 Models and methods for analyzing phenotypic evolution in lineages and clades. In J. Alroy, and G. Hunt, eds. Short Course on Quantitative Methods in Paleobiology. Paleontological Society.

Laurin, M. 2004. The Evolution of Body Size, Cope's Rule and the Origin of Amniotes. Systematic Biology 53(4):594-622.

Lloyd, G. T., S. C. Wang, and S. L. Brusatte. 2012 Identifying Heterogeneity in Rates of Morphological Evolution: Discrete Character Change in the Evolution of Lungfish(Sarcopterygii, Dipnoi). Evolution 66(2):330–348.

Smith, A. B. 1994 Systematics and the fossil record: documenting evolutionary patterns. Blackwell Scientific, Oxford.

See Also

cal3TimePaleoPhy, binTimeData, multi2di

For an alternative time-scaling function, which includes the 'ruta' method that weights the time-scaling of branches by estimates of character change along with implementations of the 'basic' and 'equal' methods described here, please see function DatePhylo in package strap.


# examples with empirical data

#load data

#Can plot the unscaled cladogram
#Can plot discrete time interval diversity curve with retioRanges

#Use basic time-scaling (terminal branches only go to FADs)
ttree <- bin_timePaleoPhy(tree = retioTree,timeList = retioRanges,type = "basic",
	ntrees = 1, plot = TRUE)

#Use basic time-scaling (terminal branches go to LADs)
ttree <- bin_timePaleoPhy(tree = retioTree,timeList = retioRanges,type = "basic",
	add.term = TRUE, ntrees = 1, plot = TRUE)

#mininum branch length time-scaling (terminal branches only go to FADs)
ttree <- bin_timePaleoPhy(tree = retioTree,timeList = retioRanges,type = "mbl",
	vartime = 1, ntrees = 1, plot = TRUE)


# examples with simulated data

#Simulate some fossil ranges with simFossilRecord
record <- simFossilRecord(p = 0.1, q = 0.1, nruns = 1,
nTotalTaxa = c(30,40), nExtant = 0)
taxa <- fossilRecord2fossilTaxa(record)
#simulate a fossil record with imperfect sampling with sampleRanges
rangesCont <- sampleRanges(taxa,r = 0.5)
#let's use taxa2cladogram to get the 'ideal' cladogram of the taxa
cladogram <- taxa2cladogram(taxa,plot = TRUE)
#Now let's try timePaleoPhy using the continuous range data
ttree <- timePaleoPhy(cladogram,rangesCont,type = "basic",plot = TRUE)
#plot diversity curve 

#that tree lacked the terminal parts of ranges (tips stops at the taxon FADs)
#let's add those terminal ranges back on with add.term
ttree <- timePaleoPhy(cladogram,rangesCont,type = "basic",add.term = TRUE,plot = TRUE)
#plot diversity curve 

#that tree didn't look very resolved, does it? (See Wagner and Erwin 1995 to see why)
#can randomly resolve trees using the argument randres
#each resulting tree will have polytomies randomly resolved in different ways using multi2di
ttree <- timePaleoPhy(cladogram,rangesCont,type = "basic",ntrees = 1,randres = TRUE,
    add.term = TRUE,plot = TRUE)
#notice well the warning it prints!
#we would need to set ntrees to a large number to get a fair sample of trees

#if we set ntrees>1, timePaleoPhy will make multiple time-trees
ttrees <- timePaleoPhy(cladogram,rangesCont,type = "basic",ntrees = 9,randres = TRUE,
    add.term = TRUE,plot = TRUE)
#let's compare nine of them at once in a plot
parOrig <- par(no.readonly = TRUE)
par(mar = c(1,1,1,1))
for(i in 1:9){plot(ladderize(ttrees[[i]]),show.tip.label = FALSE,no.margin = TRUE)}
#they are all a bit different!

#we can also resolve the polytomies in the tree according to time of first appearance
	#via the function timeLadderTree, by setting the argument 'timeres' to TRUE
ttree <- timePaleoPhy(cladogram,rangesCont,type = "basic",ntrees = 1,timeres = TRUE,
    add.term = TRUE,plot = TRUE)

#can plot the median diversity curve with multiDiv

#compare different methods of timePaleoPhy
parOrig <- par(no.readonly = TRUE)
par(mar = c(3,2,1,2))
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type = "basic",vartime = NULL,add.term = TRUE)))
    axisPhylo();text(x = 50,y = 23,"type = basic",adj = c(0,0.5),cex = 1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type = "equal",vartime = 10,add.term = TRUE)))
    axisPhylo();text(x = 55,y = 23,"type = equal",adj = c(0,0.5),cex = 1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type = "aba",vartime = 1,add.term = TRUE)))
    axisPhylo();text(x = 55,y = 23,"type = aba",adj = c(0,0.5),cex = 1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type = "zlba",vartime = 1,add.term = TRUE)))
    axisPhylo();text(x = 55,y = 23,"type = zlba",adj = c(0,0.5),cex = 1.2)
plot(ladderize(timePaleoPhy(cladogram,rangesCont,type = "mbl",vartime = 1,add.term = TRUE)))
    axisPhylo();text(x = 55,y = 23,"type = mbl",adj = c(0,0.5),cex = 1.2)

#using node.mins
#let's say we have (molecular??) evidence that node #5 is at least 1200 time-units ago
#to use node.mins, first need to drop any unshared taxa
droppers <- cladogram$tip.label[is.na(
cladoDrop <- drop.tip(cladogram, droppers)
# now make vector same length as number of nodes
nodeDates <- rep(NA, Nnode(cladoDrop))
nodeDates[5] <- 1200
ttree1 <- timePaleoPhy(cladoDrop,rangesCont,type = "basic",
  	randres = FALSE,node.mins = nodeDates,plot = TRUE)
ttree2 <- timePaleoPhy(cladoDrop,rangesCont,type = "basic",
   	randres = TRUE,node.mins = nodeDates,plot = TRUE)

#Using bin_timePaleoPhy to time-scale with discrete interval data
#first let's use binTimeData() to bin in intervals of 1 time unit
rangesDisc <- binTimeData(rangesCont,int.length = 1)
ttreeB1 <- bin_timePaleoPhy(cladogram,rangesDisc,type = "basic",ntrees = 1,randres = TRUE,
    add.term = TRUE,plot = FALSE)
#notice the warning it prints!
#with time-order resolving via timeLadderTree
ttreeB2 <- bin_timePaleoPhy(cladogram,rangesDisc,type = "basic",ntrees = 1,timeres = TRUE,
    add.term = TRUE,plot = FALSE)
#can also force the appearance timings not to be chosen stochastically
ttreeB3 <- bin_timePaleoPhy(cladogram,rangesDisc,type = "basic",ntrees = 1,
    nonstoch.bin = TRUE,randres = TRUE,add.term = TRUE,plot = FALSE)

# testing node.mins in bin_timePaleoPhy
ttree <- bin_timePaleoPhy(cladoDrop,rangesDisc,type = "basic",ntrees = 1,
    add.term = TRUE,randres = FALSE,node.mins = nodeDates,plot = TRUE)
# with randres = TRUE
ttree <- bin_timePaleoPhy(cladoDrop,rangesDisc,type = "basic",ntrees = 1,
    add.term = TRUE,randres = TRUE,node.mins = nodeDates,plot = TRUE)

#simple three taxon example for testing inc.term.adj
ranges1 <- cbind(c(3,4,5),c(2,3,1));rownames(ranges1) <- paste("t",1:3,sep = "")
clado1 <- read.tree(file = NA,text = "(t1,(t2,t3));")
ttree1 <- timePaleoPhy(clado1,ranges1,type = "mbl",vartime = 1)
ttree2 <- timePaleoPhy(clado1,ranges1,type = "mbl",vartime = 1,add.term = TRUE)
ttree3 <- timePaleoPhy(clado1,ranges1,type = "mbl",vartime = 1,add.term = TRUE,inc.term.adj = TRUE)

paleotree documentation built on Oct. 2, 2018, 5:04 p.m.