This function converts occurrence data, given as a list where each element is a different taxon's occurrence table (containing minimum and maximum ages for each occurrence), to the 'timeList' format, consisting of a list composed of a matrix of lower and upper age bounds for intervals, and a second matrix recording the interval in which taxa first and last occur in the given dataset.

```
occData2timeList(occList, intervalType = "dateRange")
```

`occList`
A list where every element is a table of occurrence data for a different taxon,
such as that returned by

`intervalType`
Must be either "dateRange" (the default), "occRange" or "zoneOverlap". Please see details below.

This function should translate taxon-sorted occurrence data, which could be Paleobiology Database datasets sorted by `taxonSortPBDBocc` or any data object where occurrence data (i.e. age bounds for each occurrence) for different taxa are separated into different elements of a named list.
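As a rough illustration of the "named list, one element per taxon" layout described above, here is a minimal base R sketch that splits a flat occurrence table by taxon. The table, its column names (`taxon`, `maxAge`, `minAge`) and the object names are hypothetical; whether `occData2timeList` accepts exactly this column layout is an assumption that is not checked here.

```r
# Hypothetical flat occurrence table: one row per occurrence, with the
# older (maxAge) and younger (minAge) age bound of each occurrence in Ma.
occTable <- data.frame(
  taxon  = c("taxonA", "taxonB", "taxonA", "taxonB", "taxonA"),
  maxAge = c(450, 447, 448, 443, 446),
  minAge = c(444, 441, 445, 439, 438)
)

# Split the age bounds into one table per taxon.
# The result is a named list with one element per taxon.
occListSketch <- split(occTable[, c("maxAge", "minAge")], occTable$taxon)

names(occListSketch)         # taxon names
sapply(occListSketch, nrow)  # number of occurrences per taxon
```

For Paleobiology Database downloads, `taxonSortPBDBocc` performs this sorting step, so a hand-rolled split like this would only be relevant for other data sources.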

The argument `intervalType` controls the algorithm used to obtain first and last interval bounds for each taxon, of which there are several to select from:

- "dateRange"
The default option. The bounds on the first appearances are the span between the oldest upper and lower bounds of the occurrences, and the bounds on the last appearances are the span between the youngest upper and lower bounds across all occurrences. This is guaranteed to provide the smallest bounds on the first and last appearances, and was originally suggested to the author by J. Marcot.

- "occRange"
This option returns the smallest bounds among (a) the oldest occurrences for the first appearance (i.e. all occurrences with their lowest bound at the oldest lower age bound), and (b) the youngest occurrences for the last appearance (i.e. all occurrences with their uppermost bound at the youngest upper age bound).

- "zoneOverlap"
This option is an attempt to mimic the stratigraphic range algorithm used by PBDB Classic, which "finds the oldest base that is older than at least part of all the intervals and the youngest that is younger than at least part of all the intervals" (pers. comm., J. Alroy). This is a somewhat more complex case, as we are trying to obtain a `timeList` object. So, for calculating the bounds of the first interval a taxon occurs in, the `zoneOverlap` algorithm looks for all occurrences that overlap with the age range of the earliest-most occurrence and (1) obtains their earliest boundary ages and returns the latest-most earliest age boundary among these overlapping occurrences and (2) obtains their latest boundary ages and returns the earliest-most latest age boundary among these overlapping occurrences. Similarly, for calculating the bounds of the last interval a taxon occurs in, the `zoneOverlap` algorithm looks for all occurrences that overlap with the age range of the latest-most occurrence and applies the same two steps: (1) it returns the latest-most earliest age boundary and (2) the earliest-most latest age boundary among these overlapping occurrences.

On theoretical grounds, one could probably describe the zone-of-overlap algorithm as minimizing taxonomic age ranges by assuming that all overlapping occurrences at the start and end of a taxon's range probably describe very similar first and last appearances (FADs and LADs), and it thus picks the bounds that extend the taxonomic range the least. However, this comes with a downside: if these occurrences are not essentially repeated attempts to capture the same FAD or LAD, then the zone-of-overlap algorithm isn't an accurate depiction of the uncertainty in the ages. The true biological range of a taxon might be well outside the bounds obtained using the zone-of-overlap algorithm. A more conservative approach is the `"dateRange"` algorithm, which finds the smallest possible bounds on the endpoints of a taxon's range without ignoring uncertainty from any particular set of occurrences.
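The three options described above can be sketched in base R for a single taxon. This is only an illustrative reading of the prose descriptions, not the code used by the package. The example table and function names are hypothetical, ages are assumed to be in Ma (larger numbers are older), and the "earliest-most" and "latest-most" occurrences are assumed to be those with the oldest lower bound and the youngest upper bound, respectively. The zone-of-overlap sketch does not guard against cases where the overlapping occurrences fail to overlap one another.

```r
# Hypothetical occurrence table for one taxon: one row per occurrence,
# column 1 = older (maximum) age bound, column 2 = younger (minimum) bound.
occ <- cbind(maxAge = c(450, 448, 446, 440),
             minAge = c(444, 445, 438, 436))

# "dateRange": FAD bounds are the oldest lower and oldest upper bounds;
# LAD bounds are the youngest lower and youngest upper bounds.
dateRangeSketch <- function(occ) {
  c(fadMax = max(occ[, 1]), fadMin = max(occ[, 2]),
    ladMax = min(occ[, 1]), ladMin = min(occ[, 2]))
}

# "occRange": narrowest bounds among the occurrences sharing the oldest
# lower bound (FAD) and among those sharing the youngest upper bound (LAD).
occRangeSketch <- function(occ) {
  oldest   <- occ[occ[, 1] == max(occ[, 1]), , drop = FALSE]
  youngest <- occ[occ[, 2] == min(occ[, 2]), , drop = FALSE]
  c(fadMax = max(occ[, 1]), fadMin = max(oldest[, 2]),
    ladMax = min(youngest[, 1]), ladMin = min(occ[, 2]))
}

# "zoneOverlap": among occurrences overlapping the reference occurrence,
# take the latest-most earliest bound and the earliest-most latest bound.
zoneOverlapSketch <- function(occ) {
  overlapsWith <- function(ref) {
    occ[occ[, 1] > ref[2] & occ[, 2] < ref[1], , drop = FALSE]
  }
  first <- overlapsWith(occ[which.max(occ[, 1]), ])  # earliest-most occurrence
  last  <- overlapsWith(occ[which.min(occ[, 2]), ])  # latest-most occurrence
  c(fadMax = min(first[, 1]), fadMin = max(first[, 2]),
    ladMax = min(last[, 1]),  ladMin = max(last[, 2]))
}

# Compare the three sets of bounds side by side (one row per option).
boundsSketch <- rbind(dateRange   = dateRangeSketch(occ),
                      occRange    = occRangeSketch(occ),
                      zoneOverlap = zoneOverlapSketch(occ))
boundsSketch
```

In this toy example the zone-of-overlap bounds are the narrowest, which matches the caveat above: they are narrow because uncertainty from some occurrences is ignored.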

Returns a standard timeList data object, as used by many other paleotree functions, like `bin_timePaleoPhy`, `bin_cal3TimePaleoPhy` and `taxicDivDisc`.
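To make the two-matrix structure of a timeList concrete, the following base R sketch builds a small one by hand and looks up the age bounds of each taxon's first and last interval. All values, taxon names and column names here are hypothetical, chosen only for illustration.

```r
# Hypothetical interval matrix: start (older) and end (younger) ages in Ma.
intTimes <- cbind(int.start = c(450, 445, 440),
                  int.end   = c(445, 440, 436))

# Hypothetical taxon matrix: the row of intTimes in which each taxon
# first and last occurs.
taxonTimes <- cbind(first.int = c(1, 2),
                    last.int  = c(2, 3))
rownames(taxonTimes) <- c("taxonA", "taxonB")

# A timeList is a list holding these two matrices, in this order.
timeListSketch <- list(intTimes, taxonTimes)

# Look up the bounds of the first and last interval for every taxon,
# giving four dates per taxon (bounds on the FAD, then bounds on the LAD).
fadBounds <- timeListSketch[[1]][timeListSketch[[2]][, 1], , drop = FALSE]
ladBounds <- timeListSketch[[1]][timeListSketch[[2]][, 2], , drop = FALSE]
fourDateSketch <- cbind(fadBounds, ladBounds)
rownames(fourDateSketch) <- rownames(timeListSketch[[2]])
fourDateSketch
```

Within paleotree, the function `timeList2fourDate` (used in the examples for this function) produces this kind of four-date table from a timeList.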

David W. Bapst, with the 'dateRange' algorithm suggested by Jon Marcot.

`taxonSortPBDBocc`, `plotOccData` and the example graptolite dataset at `graptPBDB`

```
library(paleotree)

data(graptPBDB)

# sort occurrences by species and convert to a timeList
graptOccSpecies <- taxonSortPBDBocc(graptOccPBDB, rank = "species", onlyFormal = FALSE)
graptTimeSpecies <- occData2timeList(occList = graptOccSpecies)

head(graptTimeSpecies[[1]])
head(graptTimeSpecies[[2]])

# do the same for genera
graptOccGenus <- taxonSortPBDBocc(graptOccPBDB, rank = "genus", onlyFormal = FALSE)
graptTimeGenus <- occData2timeList(occList = graptOccGenus)

# plot species and genus diversity curves, one above the other
layout(1:2)
taxicDivDisc(graptTimeSpecies)
taxicDivDisc(graptTimeGenus)

# the default interval calculation is "dateRange"
# let's compare to the other option, "occRange"
# for species
graptOccRange <- occData2timeList(occList = graptOccSpecies, intervalType = "occRange")

# we would expect no change in the diversity curve
# because there are only changes in the
# earliest bound for the FAD
# latest bound for the LAD
# so if we are depicting ranges within maximal bounds
# dateRange has no effect
layout(1:2)
taxicDivDisc(graptTimeSpecies)
taxicDivDisc(graptOccRange)
# yep, identical

# so how much uncertainty was gained by using dateRange?

# write a simple function for getting uncertainty in first and last
# appearance dates from a timeList object
sumAgeUncert <- function(timeList) {
  fourDate <- timeList2fourDate(timeList)
  # width of the FAD bounds plus width of the LAD bounds, per taxon
  perOcc <- (fourDate[, 1] - fourDate[, 2]) + (fourDate[, 3] - fourDate[, 4])
  sum(perOcc)
}

# total amount of uncertainty in occRange dataset
sumAgeUncert(graptOccRange)
# total amount of uncertainty in dateRange dataset
sumAgeUncert(graptTimeSpecies)
# the difference
sumAgeUncert(graptOccRange) - sumAgeUncert(graptTimeSpecies)
# as a proportion
1 - (sumAgeUncert(graptTimeSpecies) / sumAgeUncert(graptOccRange))

# a different way of doing it
dateChange <- timeList2fourDate(graptTimeSpecies) - timeList2fourDate(graptOccRange)
apply(dateChange, 2, sum)
# total amount of uncertainty removed by dateRange algorithm
sum(abs(dateChange))

# reset the plotting layout
layout(1)
```
