Construct a Stochastic Sequenced Time-List from an Unsequenced Time-List
This function randomly samples from a timeList object (i.e. a list composed of a matrix of interval start and end dates and a matrix of taxon first and last intervals), to find a set of taxa and intervals that do not overlap, output as a new timeList object.
A list composed of two matrices, giving interval start and end dates and taxon first and last occurrences within those intervals. Some intervals are expected to overlap (thus necessitating the use of this function), and datasets lacking overlapping intervals will return an error message.
Number of new timeList objects, each composed of non-overlapping intervals, to produce.
If TRUE, weight sampling of new intervals toward smaller intervals. FALSE by default.
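As a rough illustration of the timeList input described above, the following sketch builds a tiny timeList by hand and checks whether any of its intervals overlap. All object names and values are invented for illustration, and dates are assumed to be in time before present, so each interval's start date is larger than its end date.

```r
# Toy timeList: names and values are invented for illustration only
intTimes <- cbind(start = c(100, 90, 95, 80),
                  end   = c(90, 80, 85, 70))
# first and last interval (row of intTimes) in which each taxon occurs
taxonTimes <- cbind(first = c(1, 1, 3, 2),
                    last  = c(2, 1, 3, 4))
toyTimeList <- list(intTimes, taxonTimes)

# Two intervals overlap if each one starts (is older) before the other ends;
# intervals that only share a boundary are not counted as overlapping
n <- nrow(intTimes)
overlaps <- outer(seq_len(n), seq_len(n), function(i, j) {
  i != j & intTimes[i, "start"] > intTimes[j, "end"] &
    intTimes[j, "start"] > intTimes[i, "end"]
})
anyOverlap <- any(overlaps)
```

Here interval 3 (95 to 85) overlaps both interval 1 and interval 2, so this toy dataset is the kind of input the function is meant for.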
Many analyses of diversification and sampling in the fossil record require a dataset composed of sequential non-overlapping intervals, but the nature of the geologic record often makes this difficult, with taxa from different regions, environments and sedimentary basins having first and last appearances placed in entirely incongruent systems of chronostratigraphic intervals. While one option is to convert such occurrences to a single, global stratigraphic system, this may still result in overlapping intervals when fossil collections are poorly constrained stratigraphically. (For example, this may often be the case in global datasets.) This function offers an approach to avoid this issue in large datasets by randomly subsampling the available taxa and intervals to produce stochastic sets of ranges composed of data drawn from non-overlapping intervals.
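The general idea of random subsampling can be sketched in base R as below. This is not the algorithm used by seqTimeList, only a minimal illustration under simple assumptions: intervals are visited in random order and kept only if they do not overlap an interval already kept, and a taxon is retained only if both its first and last intervals were kept. All names and values are hypothetical, and the sketch does not re-sort the kept intervals or re-index the taxa.

```r
# Sketch only: NOT the seqTimeList algorithm, just the general idea
set.seed(1)
intTimes <- cbind(start = c(100, 90, 95, 80), end = c(90, 80, 85, 70))
taxonTimes <- cbind(first = c(1, 1, 3, 2), last = c(2, 1, 3, 4))

kept <- integer(0)
for (i in sample(nrow(intTimes))) {      # visit intervals in random order
  clash <- intTimes[i, "start"] > intTimes[kept, "end"] &
    intTimes[kept, "start"] > intTimes[i, "end"]
  if (!any(clash)) kept <- c(kept, i)    # keep only if no overlap so far
}

# retain taxa whose first and last intervals were both kept
keepTaxa <- taxonTimes[, "first"] %in% kept & taxonTimes[, "last"] %in% kept
```

Because the visiting order is random, repeated runs can return different sets of intervals and taxa.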
This function is stochastic and thus should be set for many runs to produce many such solutions. Additionally, all solutions found are returned, and users may wish to sort amongst these to maximize the number of intervals and number of taxa returned. A single solution which maximizes returned taxa and intervals may not be a precise enough approach to estimating sampling rates, however, given the uncertainty in data. Thus, many runs should always be considered.
By default, solutions are searched for without consideration of the length of intervals used (i.e. the selection of intervals is 'unweighted').
Alternatively, we can 'weight' selection toward the smallest intervals in the set, using the argument weightSampling.
Smaller intervals presumably overlap less and thus should retain more taxa and intervals of more equal length. However, in practice with empirical datasets,
the package author finds these approaches do not seem to produce very different estimates.
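One way to picture the difference between unweighted and weighted selection is through the sampling probabilities assigned to each interval. The weighting scheme below (probability inversely proportional to interval length) is an assumption made for illustration; the source does not state the exact scheme used by the function, and all names and values are hypothetical.

```r
# Toy interval dates (time before present); values are invented
intTimes <- cbind(start = c(100, 90, 95, 80), end = c(90, 80, 70, 75))
intLength <- intTimes[, "start"] - intTimes[, "end"]

# unweighted: every interval is equally likely to be drawn
probUnweighted <- rep(1 / nrow(intTimes), nrow(intTimes))

# weighted (assumed scheme): probability inversely proportional to length
probWeighted <- (1 / intLength) / sum(1 / intLength)

# a random visiting order that tends to favor the shorter intervals
set.seed(1)
drawOrder <- sample(nrow(intTimes), prob = probWeighted)
```

Under this assumed scheme the shortest interval (80 to 75) is the most likely to be drawn first and the longest (95 to 70) the least likely.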
For some datasets, many solutions found using seqTimeList may return infinite sampling values. This is often due to saving too many taxa found in single intervals to the exclusion of longer-ranging taxa (see the example). This excess of single interval taxa is a clear artifact of the randomized seqTimeList procedure and such solutions should probably be ignored.
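A simple way to set aside such solutions is to drop estimates that are not finite and positive. The vector below is a made-up stand-in for the per-solution estimates one might get from an analysis such as freqRat; the values are not real results.

```r
# hypothetical per-solution estimates (invented values, not real output)
est <- c(0.42, Inf, 0.37, 0, NaN, 0.51)

# keep only finite, positive estimates
usable <- is.finite(est) & est > 0
est[usable]
summary(est[usable])
```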
A list, composed of three elements:
nIntervals, which is a vector of the number of intervals in each solution,
nTaxa, which is a vector of the number of taxa in each solution, and
timeLists, which is a list composed of each new timeList object as an element.
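To sort among the returned solutions, the nIntervals and nTaxa vectors can be used to rank the entries of timeLists. The sketch below uses a mock result with the same element names as described above; the counts are invented and the timeLists entries are empty placeholders.

```r
# Mock result with the documented element names; values are invented
res <- list(nIntervals = c(5, 7, 7, 6),
            nTaxa = c(30, 28, 35, 40),
            timeLists = vector("list", 4))  # placeholders for timeList objects

# rank solutions by number of intervals, breaking ties by number of taxa
ranking <- order(res$nIntervals, res$nTaxa, decreasing = TRUE)

# the top-ranked solution (a placeholder in this mock example)
best <- res$timeLists[[ranking[1]]]
```

Ranking by taxa first, or keeping several top-ranked solutions rather than a single one, are equally reasonable choices given the caution above about relying on a single solution.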
David W. Bapst
Resulting time-lists can be analyzed with functions such as freqRat (used in the example).
binTimeData can be useful for simulating interval data.
library(paleotree)

#Simulate some fossil ranges with simFossilRecord
set.seed(444)
record <- simFossilRecord(p=0.1, q=0.1, nruns=1,
	nTotalTaxa=c(60,80), nExtant=0)
taxa <- fossilRecord2fossilTaxa(record)
#simulate a fossil record with imperfect sampling with sampleRanges()
rangesCont <- sampleRanges(taxa,r=0.1)
#Now let's use binTimeData to get ranges in discrete overlapping intervals
#via pre-set intervals input
presetIntervals <- cbind(
	c(1000,995,990,980,970,975,960,950,940,930,900,890,888,879,875),
	c(995,989,960,975,960,950,930,930,930,900,895,888,880,875,870))
rangesDisc1 <- binTimeData(rangesCont,int.times=presetIntervals)

seqLists <- seqTimeList(rangesDisc1,nruns=10)
seqLists$nTaxa
seqLists$nIntervals

#apply freqRat as an example analysis
sapply(seqLists$timeLists,freqRat)

#notice the zero and infinite freqRat estimates? What's going on?
#plot a single solution; the index here is arbitrary, so change it to
#a solution with a zero or infinite estimate
freqRat(seqLists$timeLists[[1]],plot=TRUE)
#too few taxa of two or three interval durations for the ratio to work properly
#perhaps ignore these estimates

#with weighted selection of intervals
seqLists <- seqTimeList(rangesDisc1,nruns=10,weightSampling=TRUE)
seqLists$nTaxa
seqLists$nIntervals
sapply(seqLists$timeLists,freqRat)
#didn't have much effect in this simulated example