timePaleoPhy | R Documentation |
Dates an unscaled cladogram of fossil taxa using information on their
temporal ranges, using various methods. Also can resolve polytomies randomly
and output samples of randomly-resolved trees. As simple methods of dating ('time-scaling')
phylogenies of fossil taxa can have biasing effects on macroevolutionary analyses
(Bapst, 2014, Paleobiology), this function is largely retained for legacy purposes
and plotting applications. The methods implemented
by the functions listed here do not return realistic estimates of
divergence dates, and users are strongly encouraged to investigate other
methods such as cal3TimePaleoPhy
or createMrBayesTipDatingNexus
.
timePaleoPhy( tree, timeData, type = "basic", vartime = NULL, ntrees = 1, randres = FALSE, timeres = FALSE, add.term = FALSE, inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL, noisyDrop = TRUE, plot = FALSE ) bin_timePaleoPhy( tree, timeList, type = "basic", vartime = NULL, ntrees = 1, nonstoch.bin = FALSE, randres = FALSE, timeres = FALSE, sites = NULL, point.occur = FALSE, add.term = FALSE, inc.term.adj = FALSE, dateTreatment = "firstLast", node.mins = NULL, noisyDrop = TRUE, plot = FALSE )
tree |
An unscaled cladogram of fossil taxa, of class |
timeData |
Two-column matrix of first and last occurrences in absolute
continuous time, with row names as the taxon IDs used on the tree. This means the
first column is very precise FADs (first appearance dates) and the second
column is very precise LADs (last appearance dates), reflect the precise points
in time when taxa first and last appear. If there is stratigraphic uncertainty in
when taxa appear in the fossil record, it is preferable to use the |
type |
Type of time-scaling method used. Can be |
vartime |
Time variable; usage depends on the |
ntrees |
Number of dated trees to output.
Only applicable is there is some stochastic (random) element to the analysis.
If |
randres |
Should polytomies be randomly resolved? By default,
|
timeres |
Should polytomies be resolved relative to the order of
appearance of lineages? By default, |
add.term |
If |
inc.term.adj |
If |
dateTreatment |
This argument controls the interpretation of A second option is A third option is With both arguments |
node.mins |
The minimum dates of internal nodes (clades) on a phylogeny can be set
using |
noisyDrop |
If |
plot |
If |
timeList |
A list composed of two matrices giving interval times and
taxon appearance dates. The rownames of the second matrix should be the taxon IDs,
identical to the |
nonstoch.bin |
If |
sites |
Optional two column matrix, composed of site IDs for taxon FADs
and LADs. The sites argument allows users to constrain the placement of
dates by restricting multiple fossil taxa whose FADs or LADs are from the
same very temporally restricted sites (such as fossil-rich Lagerstatten) to
always have the same date, across many iterations of time-scaled trees. To
do this, provide a |
point.occur |
If true, will automatically produce a 'sites' matrix which forces all FADs and LADs to equal each other. This should be used when all taxa are only known from single 'point occurrences', i.e. each is only recovered from a single bed/horizon, such as a Lagerstatten. |
Simplistic 'a posteriori' Dating (aka 'Time-Scaling') Methods for Paleontology Phylogenies
These functions are an attempt to unify and collect previously used and discussed a posteriori methods for time-scaling phylogenies of fossil taxa. Unfortunately, it can be difficult to attribute some time-scaling methods to specific references in the literature.
There are five main a posteriori approaches that
can be used by timePaleoPhy
. Four of these
main types use some value of absolute time, chosen a priori, to date the tree.
This is handled by the argument vartime
, which is NULL
by default and unused
for type "basic"
.
This most simple of time-scaling methods ignores vartime
and
scales nodes so they are as old as the first appearance of their oldest
descendant (Smith, 1994). This method produces many zero-length branches
(Hunt and Carrano, 2010).
The "equal"
method defined by G. Lloyd and used in Brusatte
et al. (2008) and Lloyd et al. (2012). Originally usable in code supplied by
G. Lloyd, the "equal"
algorithm is recreated here as closely as possible. This method
works by increasing the time of the root divergence by some amount and then
adjusting zero-length branches so that time on early branches is re-apportioned
out along those later branches equally. Branches are adjusted in order relative
to the number of nodes separating the edge from the root, going from the furthest
(most shallow) edges to the deepest edges.
The choice of ordering algorithm can have an unanticipated large effect on the
resulting dated trees created using "equal"
and it appears that
paleotree
and functions written by G. Lloyd were not always consistent.
The default option described here was only introduced in paleotree
and other available software sources in August 2014.
Thus, two legacy "equal"
methods are included in this function, so users can
emulate older ordering algorithms for "equal"
which are now deprecated, as they do not
match the underlying logic of the original "equal"
algorithm and do not minimize down-passes
when adjusting branch lengths on the time-scaled tree.
The root age can be adjusted backwards in time by either increasing by
an arbitrary amount (via the vartime
argument) or by setting the
root age directly (via the node.mins
argument); conversely, the
function will also allow a user to opt to not alter the root age at all.
Exactly like "equal"
above,
except that edges are ordered instead
by their depth (i.e. number of nodes from the root). This minor modified version
was referred to as "equal"
for this timePaleoPhy
function in paleotree
until February 2014, and thus is
included here solely for legacy purposes.
This ordering algorithm does not minimize branch adjustment cycles,
like the newer default offered under currently "equal"
.
Exactly like "equal"
above, except that edges are ordered relative
to their time (i.e., the total edge length) from the root following the application of the 'basic'
time-scaling method, exactly as in G. Lloyd's original application. This was the method for sorting
edges in the "equal"
algorithm in G. Lloyd's date.phylo
script and DatePhylo
in
package strap
until August 2014, and was the default
"equal"
algorithm in paleotree
's timePaleoPhy
function from February 2014 until August 2014.
This ordering algorithm does not minimize branch adjustment cycles,
like the newer default offered under currently "equal"
.
Due to how the presence of zero-length
branches can make ordering branches based on time to be very unpredictable,
this version of the "equal"
algorithm is highly not recommended.
All branches additive.
This method takes the "basic"
time-scaled tree and
adds vartime
to all branches.
Note that this time-scaling method can (and often will) warp the
tree structure, leading to tips to originate out of order with the appearance
data used.
Zero-length branches additive.
This method adds vartime
to all
zero-length branches in the "basic" tree.
Discussed (possibly?) by Hunt and Carrano, 2010.
Note that this time-scaling method can warp the tree structure,
leading to tips to originate out of order with the appearance data used.
Minimum branch length. Scales all branches so they are
greater than or equal to vartime
, and subtract time added to later branches
from earlier branches in order to maintain the temporal structure of events.
A version of this was first introduced by Laurin (2004).
These functions cannot time-scale branches relative to reconstructed
character changes along branches, as used by Lloyd et al. (2012). Please
see DatePhylo
in R package strap
for this functionality.
These functions will intuitively drop taxa from the tree with NA for range
or are missing from timeData
or timeList
. Taxa dropped from the tree will be
will be listed in a message output to the user. The same is done for taxa in
the timeList
object not listed in the tree.
As with many functions in the paleotree
library, absolute time is always
decreasing, i.e. the present day is zero.
As of August 2014, please note that the branch-ordering algorithm used in "equal"
has changed
to match the current algorithm used by DatePhylo
in package strap
, and that two legacy
versions of "equal"
have been added to this function, respectively representing how timePaleoPhy
and DatePhylo
(and its predecessor date.phylo
) applied the "equal"
time-scaling method.
Interpretation of Taxon Ages in timePaleoPhy
timePaleoPhy
is primarily designed for direct application to datasets where taxon first
and last appearances are precisely known in continuous time, with no stratigraphic
uncertainty. This is an uncommon form of data to have from the fossil record,
although not an impossible form (micropaleontologists often have very precise
range charts, for example).
Instead, most data has some form of stratigraphic uncertainty.
However, for some groups, the more typical 'first' and 'last' dates
found in the literature or in databases represent the minimum
and maximum absolute ages for the fossil collections that a taxon is known
is known from. Presumably, the first and last appearances of that taxon in
the fossil record is at unknown dates within these bounds.
As of paleotree v2.0. the treatment of taxon ages in
timePaleoPhy
is handled by the argument dateTreatment
.
By default, this argument is set to "firstLast"
which means the matrix of ages are treated
as precise first and last appearance dates (i.e. FADs and LADs). The earlier FADs will be used
to calibrate the node ages, which could produce fairly nonsensical results if these are 'minimum'
ages instead and reflect age uncertainty. Alternatively, dateTreatment
can be set to "minMax"
which instead treats taxon age data as minimum and maximum bounds on a single point date.
These point dates, if the minimum and maximum bounds option is selected,
are chose under a uniform distribution. Many dated trees should be generated, in order to approximate
the uncertainty in the dates. Additionally, there is a third option for dateTreatment
:
users may also make it so that the 'times of observation'
of trees are uncertain, such that the tips of the tree (with terminal ranges added) should
be randomly selected from a uniform distribution. Essentially, this third option treats the
dates as first and last appearances, but treats the first appearance dates as known and
fixed, but the 'last appearance' dates as unknown. In previous versions of paleotree,
this third option was enacted with the argument rand.obs
, which has been removed for
clarity.
Interpretation of Taxon Ages in bin_timePaleoPhy
As an alternative to using timePaleoPhy
, bin_timePaleoPhy
is a wrapper of
timePaleoPhy
which produces time-scaled trees for datasets which only have
interval data available. For each output tree, taxon first and last appearance
dates are placed within their listed intervals under a uniform distribution.
Thus, a large sample of dated trees will (hopefully) approximate the uncertainty in
the actual timing of the FADs and LADs. In some ways, treating taxonomic age uncertainty
may be more logical via bin_timePaleoPhy
, as it is tied to specific interval bounds,
and there are more options available for certain types of age uncertainty, such as for cases
where specimens come from the same fossil site.
The input timeList
object for bin_timePaleoPhy
can have overlapping
(i.e. non-sequential) intervals, and intervals of
uneven size. Taxa alive in the modern should be listed as last
occurring in a time interval that begins at time 0 and ends at time 0. If taxa
occur only in single collections (i.e. their first and last appearance in the
fossil record is synchronous, the argument point.occur
will force all taxa
to have instantaneous durations in the fossil record. Otherwise, by default,
taxa are assumed to first and last appear in the fossil record at different points
in time, with some positive duration. The sites
matrix can be used to force
only a portion of taxa to have simultaneous first and last appearances.
If timeData
or the elements of timeList
are actually data.frames
(as output
by read.csv
or read.table
), these will be coerced to a matrix.
Tutorial
A tutorial for applying the time-scaling functions in paleotree, along with an example using real (graptolite) data, can be found here: https://nemagraptus.blogspot.com/2013/06/a-tutorial-to-cal3-time-scaling-using.html
The output of these functions is a time-scaled tree or set of
time-scaled trees, of either class phylo
or multiphylo
, depending on the
argument ntrees
. All trees are output with an element $root.time. This is
the time of the root on the tree and is important for comparing patterns
across trees. Note that the $root.time element is defined relative to the
earliest first appearance date, and thus later tips may seem to occur in
the distant future under the "aba"
and "zbla"
time-scaling methods.
Trees created with bin_timePaleoPhy
will output with some additional
elements, in particular $ranges.used, a matrix which records the
continuous-time ranges generated for time-scaling each tree. (Essentially a
pseudo-timeData
matrix.)
Please account for stratigraphic uncertainty in your analysis.
Unless you have exceptionally resolved data, select an appropriate option in
dateTreatment
within timePaleoPhy
, use the more sophisticated
bin_timePaleoPhy
or code your own wrapper function of timePaleoPhy
that accounts for stratigraphic uncertainty in your dataset.
David W. Bapst, heavily inspired by code supplied by Graeme Lloyd and Gene Hunt.
Bapst, D. W. 2013. A stochastic rate-calibrated method for time-scaling phylogenies of fossil taxa. Methods in Ecology and Evolution. 4(8):724-733.
Bapst, D. W. 2014. Assessing the effect of time-scaling methods on phylogeny-based analyses in the fossil record. Paleobiology 40(3):331-351.
Brusatte, S. L., M. J. Benton, M. Ruta, and G. T. Lloyd. 2008 Superiority, Competition, and Opportunism in the Evolutionary Radiation of Dinosaurs. Science 321(5895):1485-91488.
Hunt, G., and M. T. Carrano. 2010 Models and methods for analyzing phenotypic evolution in lineages and clades. In J. Alroy, and G. Hunt, eds. Short Course on Quantitative Methods in Paleobiology. Paleontological Society.
Laurin, M. 2004. The Evolution of Body Size, Cope's Rule and the Origin of Amniotes. Systematic Biology 53(4):594-622.
Lloyd, G. T., S. C. Wang, and S. L. Brusatte. 2012 Identifying Heterogeneity in Rates of Morphological Evolution: Discrete Character Change in the Evolution of Lungfish(Sarcopterygii, Dipnoi). Evolution 66(2):330–348.
Smith, A. B. 1994 Systematics and the fossil record: documenting evolutionary patterns. Blackwell Scientific, Oxford.
cal3TimePaleoPhy
, binTimeData
,
multi2di
For an alternative time-scaling function, which includes the 'ruta'
method
that weights the time-scaling of branches by estimates of character change
along with implementations of the 'basic'
and "equal"
methods described here, please see function DatePhylo
in package strap
.
# examples with empirical data #load data data(retiolitinae) #Can plot the unscaled cladogram plot(retioTree) #Can plot discrete time interval diversity curve with retioRanges taxicDivDisc(retioRanges) #Use basic time-scaling (terminal branches only go to FADs) ttree <- bin_timePaleoPhy( tree = retioTree, timeList = retioRanges, type = "basic", ntrees = 1, plot = TRUE ) #Use basic time-scaling (terminal branches go to LADs) ttree <- bin_timePaleoPhy( tree = retioTree, timeList = retioRanges, type = "basic", add.term = TRUE, ntrees = 1, plot = TRUE ) #mininum branch length time-scaling (terminal branches only go to FADs) ttree <- bin_timePaleoPhy( tree = retioTree, timeList = retioRanges, type = "mbl", vartime = 1, ntrees = 1, plot = TRUE ) ################### # examples with simulated data # Simulate some fossil ranges with simFossilRecord set.seed(444) record <- simFossilRecord( p = 0.1, q = 0.1, nruns = 1, nTotalTaxa = c(30,40), nExtant = 0 ) taxa <- fossilRecord2fossilTaxa(record) #simulate a fossil record with imperfect sampling with sampleRanges rangesCont <- sampleRanges(taxa, r = 0.5) #let's use taxa2cladogram to get the 'ideal' cladogram of the taxa cladogram <- taxa2cladogram(taxa, plot = TRUE) #Now let's try timePaleoPhy using the continuous range data ttree <- timePaleoPhy( cladogram, rangesCont, type = "basic", plot = TRUE ) #plot diversity curve phyloDiv(ttree) ################################################ # that tree lacked the terminal parts of ranges # (tips stops at the taxon FADs) # let's add those terminal ranges back on with add.term ttree <- timePaleoPhy( cladogram, rangesCont, type = "basic", add.term = TRUE, plot = TRUE ) #plot diversity curve phyloDiv(ttree) ################################################# # that tree didn't look very resolved, does it? # (See Wagner and Erwin 1995 to see why) # can randomly resolve trees using the argument randres # each resulting tree will have polytomies # randomly resolved stochastically using ape::multi2di ttree <- timePaleoPhy( cladogram, rangesCont, type = "basic", ntrees = 1, randres = TRUE, add.term = TRUE, plot = TRUE ) # Notice the warning it prints! PAY ATTENTION! # We would need to set ntrees to a large number # to get a fair sample of trees # if we set ntrees > 1, timePaleoPhy will make multiple time-trees ttrees <- timePaleoPhy( cladogram, rangesCont, type = "basic", ntrees = 9, randres = TRUE, add.term = TRUE, plot = TRUE) #let's compare nine of them at once in a plot layout(matrix(1:9, 3, 3)) parOrig <- par(no.readonly = TRUE) par(mar = c(1, 1, 1, 1)) for(i in 1:9){ plot( ladderize(ttrees[[i]]), show.tip.label = FALSE, no.margin = TRUE ) } #they are all a bit different! ############################################## # we can also resolve the polytomies in the tree # according to time of first appearance via the function timeLadderTree # by setting the argument 'timeres = TRUE' ttree <- timePaleoPhy( cladogram, rangesCont, type = "basic", ntrees = 1, timeres = TRUE, add.term = TRUE, plot = TRUE ) #can plot the median diversity curve with multiDiv layout(1) par(parOrig) multiDiv(ttrees) #compare different methods of timePaleoPhy layout(matrix(1:6, 3, 2)) parOrig <- par(no.readonly = TRUE) par(mar = c(3, 2, 1, 2)) plot(ladderize(timePaleoPhy( cladogram, rangesCont, type = "basic", vartime = NULL, add.term = TRUE ))) axisPhylo() text(x = 50,y = 23, "type = basic", adj = c(0,0.5), cex = 1.2) # plot(ladderize(timePaleoPhy( cladogram, rangesCont, type = "equal", vartime = 10, add.term = TRUE ))) axisPhylo() text(x = 55,y = 23, "type = equal", adj = c(0,0.5), cex = 1.2) # plot( ladderize( timePaleoPhy( cladogram, rangesCont, type = "aba", vartime = 1, add.term = TRUE ) ) ) axisPhylo() text(x = 55,y = 23, "type = aba", adj = c(0,0.5), cex = 1.2) # plot( ladderize( timePaleoPhy( cladogram, rangesCont, type = "zlba", vartime = 1, add.term = TRUE ) ) ) axisPhylo() text(x = 55, y = 23, "type = zlba", adj = c(0,0.5), cex = 1.2 ) # plot( ladderize( timePaleoPhy( cladogram, rangesCont, type = "mbl", vartime = 1, add.term = TRUE ) ) ) axisPhylo() text(x = 55,y = 23, "type = mbl", adj = c(0,0.5), cex = 1.2 ) layout(1) par(parOrig) ############################################## #using node.mins #let's say we have (molecular??) evidence that # node #5 is at least 1200 time-units ago #to use node.mins, first need to drop any unshared taxa droppers <- cladogram$tip.label[is.na( match(cladogram$tip.label, names(which(!is.na(rangesCont[,1]))) ) )] cladoDrop <- drop.tip(cladogram, droppers) # now make vector same length as number of nodes nodeDates <- rep(NA, Nnode(cladoDrop)) nodeDates[5] <- 1200 ttree1 <- timePaleoPhy( cladoDrop,rangesCont, type = "basic", randres = FALSE, node.mins = nodeDates, plot = TRUE) ttree2 <- timePaleoPhy( cladoDrop, rangesCont, type = "basic", randres = TRUE, node.mins = nodeDates, plot = TRUE) #################################################### ################################################### #################################################### #Using bin_timePaleoPhy to time-scale with discrete interval data #first let's use binTimeData() to bin in intervals of 1 time unit rangesDisc <- binTimeData(rangesCont,int.length = 1) ttreeB1 <- bin_timePaleoPhy( cladogram, rangesDisc, type = "basic", ntrees = 1, randres = TRUE, add.term = TRUE, plot = FALSE ) #notice the warning it prints! phyloDiv(ttreeB1) #with time-order resolving via timeLadderTree ttreeB2 <- bin_timePaleoPhy( cladogram, rangesDisc, type = "basic", ntrees = 1, timeres = TRUE, add.term = TRUE, plot = FALSE ) phyloDiv(ttreeB2) #can also force the appearance timings not to be chosen stochastically ttreeB3 <- bin_timePaleoPhy( cladogram, rangesDisc, type = "basic", ntrees = 1, nonstoch.bin = TRUE, randres = TRUE, add.term = TRUE, plot = FALSE ) phyloDiv(ttreeB3) # testing node.mins in bin_timePaleoPhy ttree <- bin_timePaleoPhy( cladoDrop, rangesDisc, type = "basic", ntrees = 1, add.term = TRUE, randres = FALSE, node.mins = nodeDates, plot = TRUE ) # with randres = TRUE ttree <- bin_timePaleoPhy( cladoDrop, rangesDisc, type = "basic", ntrees = 1, add.term = TRUE, randres = TRUE, node.mins = nodeDates, plot = TRUE ) #simple three taxon example for testing inc.term.adj ranges1 <- cbind(c(3, 4, 5), c(2, 3, 1)) rownames(ranges1) <- paste("t", 1:3, sep = "") clado1 <- read.tree(file = NA, text = "(t1,(t2,t3));") ttree1 <- timePaleoPhy( clado1, ranges1, type = "mbl", vartime = 1 ) ttree2 <- timePaleoPhy( clado1, ranges1, type = "mbl", vartime = 1, add.term = TRUE ) ttree3 <- timePaleoPhy( clado1, ranges1, type = "mbl", vartime = 1, add.term = TRUE, inc.term.adj = TRUE ) # see differences in root times ttree1$root.time ttree2$root.time ttree3$root.time -apply(ranges1, 1, diff) layout(1:3) plot(ttree1) axisPhylo() plot(ttree2) axisPhylo() plot(ttree3) axisPhylo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.