Ancestor-Descendant Relationships for Macroperforate Foraminifera, from Aze et al. (2011)

Description

An example dataset of ancestor-descendant relationships and first and last appearance dates for a set of macroperforate Foraminifera, taken from the supplemental materials of Aze et al. (2011). This dataset is included here primarily for testing the functions parentChild2taxonTree and taxa2phylo.

Format

The 'foramAM' and 'foramAL' tables include budding taxon units for morphospecies and lineages, respectively, with four columns: taxon name, ancestral taxon's name, first appearance date and last appearance date (note that column headings vary). The 'foramAMb' and 'foramALb' tables are composed of data for the same taxon units as the previous two, except that taxa are split at branching events so that the relationships are fully 'bifurcating', rather than 'budding'. As this obscures taxonomic identity, taxon identification labels are included in an additional, fifth column in these tables. See the examples section for more details.
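
As a rough illustration of the difference between the two layouts, the following toy tables encode a single budding event both ways. The taxa, dates, unit codes and column names are all hypothetical and are not taken from Aze et al. (2011); the real tables use varying column headings.

```r
# Hypothetical budding table: taxon 'A' persists after giving rise to 'B'
toyBudding<-data.frame(
	name=c('A','B'),
	ancestor=c(NA,'A'),
	start=c(20,10),	# first appearance (Ma)
	end=c(0,5),	# last appearance (Ma)
	stringsAsFactors=FALSE)

# The same history made fully bifurcating: 'A' is cut at the branching
	# event (10 Ma) into two abstract units, so a fifth column is
	# needed to record which named taxon each unit belongs to
toyBifurcating<-data.frame(
	code=c('u1','u2','u3'),
	ancestor=c(NA,'u1','u1'),
	start=c(20,10,10),
	end=c(10,0,5),
	speciesName=c('A','A','B'),
	stringsAsFactors=FALSE)

# both encodings cover the same named taxa
identical(sort(unique(toyBifurcating$speciesName)),sort(toyBudding$name))
```

In this sketch the bifurcating table has one more row than the budding table, because the persisting ancestor is represented by two units.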

Details

This example dataset is composed of four tables, each containing information on the ancestor-descendant relationships and first and last appearances of macroperforate foraminifera species from the fossil record. All four tables cover the same set of taxa, but divide and concatenate the included foram species in four different ways, depending on the use of morphospecies versus combined anagenetic lineages (see Ezard et al., 2012), and on whether taxa are retained as units related by budding cladogenesis or split at branching points to create a fully 'bifurcating' set of relationships, independent of ancestral morphotaxon persistence through branching events. See the examples section for more details.

Source

This dataset is obtained from the supplementary materials of Aze et al. (2011), specifically 'Appendix S5':

Aze, T., T. H. G. Ezard, A. Purvis, H. K. Coxall, D. R. M. Stewart, B. S. Wade, and P. N. Pearson. 2011. A phylogeny of Cenozoic macroperforate planktonic foraminifera from fossil data. Biological Reviews 86(4):900-927.

References

This dataset has been used or referenced in a number of works, including:

Aze, T., T. H. G. Ezard, A. Purvis, H. K. Coxall, D. R. M. Stewart, B. S. Wade, and P. N. Pearson. 2013. Identifying anagenesis and cladogenesis in the fossil record. Proceedings of the National Academy of Sciences 110(32):E2946-E2946.

Ezard, T. H. G., T. Aze, P. N. Pearson, and A. Purvis. 2011. Interplay Between Changing Climate and Species' Ecology Drives Macroevolutionary Dynamics. Science 332(6027):349-351.

Ezard, T. H. G., P. N. Pearson, T. Aze, and A. Purvis. 2012. The meaning of birth and death (in macroevolutionary birth-death models). Biology Letters 8(1):139-142.

Ezard, T. H. G., G. H. Thomas, and A. Purvis. 2013. Inclusion of a near-complete fossil record reveals speciation-related molecular evolution. Methods in Ecology and Evolution 4(8):745-753.

Strotz, L. C., and A. P. Allen. 2013. Assessing the role of cladogenesis in macroevolution by integrating fossil and molecular evidence. Proceedings of the National Academy of Sciences 110(8):2904-2909.

Strotz, L. C., and A. P. Allen. 2013. Reply to Aze et al.: Distinguishing speciation modes based on multiple lines of evidence. Proceedings of the National Academy of Sciences 110(32):E2947-E2947.

Examples

# Following Text Reproduced from Aze et al. 2011's Supplemental Material
# Appendix S5
# 
# 'Data required to produce all of the phylogenies included in the manuscript
# using paleoPhylo (Ezard & Purvis, 2009) a free software package to draw
# paleobiological phylogenies in R.'
#
# 'The four tabs hold different versions of our phylogeny:
#	 aMb: fully bifurcating morphospecies phylogeny
#	 aM: budding/bifurcating morphospecies phylogeny
#	 aLb: fully bifurcating lineage phylogeny
#	 aL: budding/bifurcating lineage phylogeny
#
# 'Start Date gives the first occurence of the species according
# to the particular phylogeny; End Date gives the last occurence
# according to the particular phylogeny.'

## Not run: 

#load the data 
	#given in supplemental as XLS sheets
	#converted to separate tab-delimited text files

# aM: budding/bifurcating morphospecies phylogeny
foramAM<-read.table(file.choose(),stringsAsFactors=FALSE,header=TRUE)
# aL: budding/bifurcating lineage phylogeny
foramAL<-read.table(file.choose(),stringsAsFactors=FALSE,header=TRUE)
# aMb: fully bifurcating morphospecies phylogeny
foramAMb<-read.table(file.choose(),stringsAsFactors=FALSE,header=TRUE)
# aLb: fully bifurcating lineage phylogeny
foramALb<-read.table(file.choose(),stringsAsFactors=FALSE,header=TRUE)

save.image("macroperforateForam.rdata")


## End(Not run)

#instead, we'll just load the data directly
data(macroperforateForam)

#Two distinctions among the four datasets:
#(1): morphospecies vs. 'lineages', which are morphospecies combined
	# into anagenetic sequences. Thus there are far more morphospecies
	# than lineages. The names of lineages are given as the sequence of
	# their respective component morphospecies.
#(2): Datasets where taxon units (morphospecies or lineages) are broken up
	# at 'budding' branching events (where the ancestral taxon persists)
	# so that final dataset is 'fully bifurcating', presumably
	# to make comparison easier to extant-taxon only datasets.
	# (This isn't a limitation for paleotree, though!).
	# This division of taxon units requires abstracting the taxon IDs,
	# requiring another column for Species Name.

dim(foramAM)
dim(foramAL)
dim(foramAMb)
dim(foramALb)

#Need to convert these to same format as fossilRecord2fossilTaxa output.
	#those 'taxa' tables have 6 columns:
	#taxon.id ancestor.id orig.time ext.time still.alive looks.like

#for the purposes of this, we'll make taxon.id=looks.like
	# (That's only for simulating cryptic speciation anyway)
#still.alive should be TRUE (1) if ext.time=0

#a function to convert Aze et al's suppmat to paleotree-readable format

createTaxaData<-function(table){
	#reorder table by first appearance time
	table<-table[order(-as.numeric(table[,3])),]
	ID<-1:nrow(table)
	anc<-sapply(table[,2],function(x)
		if(!is.na(x)){
			which(x==table[,1])
		}else{ NA })
	stillAlive<-as.numeric(table[,4]==0)
	ages<-cbind(as.numeric(table[,3]),as.numeric(table[,4]))
	res<-cbind(ID,anc,ages,stillAlive,ID)
	colnames(res)<-c('taxon.id','ancestor.id','orig.time',
		'ext.time','still.alive','looks.like')
	rownames(res)<-table[,1]
	return(res)
	}
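
# A small hypothetical example of the name-to-ID step used above
	# (made-up taxon names; match() is used here only as a compact
	# stand-in for the sapply/which call inside createTaxaData)
toyNames<-c('A','B','C')
toyAnc<-c(NA,'A','A')
toyAncID<-match(toyAnc,toyNames)
toyAncID	# NA 1 1: the root has no ancestor, B and C descend from row 1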

taxaAM<-createTaxaData(foramAM)
taxaAMb<-createTaxaData(foramAMb)
taxaAL<-createTaxaData(foramAL)
taxaALb<-createTaxaData(foramALb)

##################################

#Checking Ancestor-Descendant Relationships for Irregularities

#For each of these, there should only be a single taxon
	# without a parent listed (essentially, the root ancestor)

countParentsWithoutMatch<-function(table){
    	parentMatch<-match(unique(table[,2]),table[,1])
    	sum(is.na(parentMatch))
	}
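
# As a quick sanity check of the idea above (not part of the original
	# analysis), here is the same parent-matching logic applied inline
	# to a tiny hypothetical table with a deliberately broken entry
toyRel<-data.frame(
	name=c('A','B','C'),
	ancestor=c(NA,'A','Z'),	# 'Z' is never listed as a taxon
	stringsAsFactors=FALSE)
toyMatch<-match(unique(toyRel[,2]),toyRel[,1])
# counts 2: the NA parent of the root, plus the unmatched 'Z'
toyUnmatched<-sum(is.na(toyMatch))
toyUnmatched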

#test this on the provided ancestor-descendant relationships
countParentsWithoutMatch(foramAM)
countParentsWithoutMatch(foramAL)
countParentsWithoutMatch(foramAMb)
countParentsWithoutMatch(foramALb)

#and on the converted datasets
countParentsWithoutMatch(taxaAM)
countParentsWithoutMatch(taxaAL)
countParentsWithoutMatch(taxaAMb)
countParentsWithoutMatch(taxaALb)

 

#can construct taxon trees with parentChild2taxonTree
	#using the ancestor-descendant relationships 

#can be very slow...

treeAM<-parentChild2taxonTree(foramAM[,2:1])
treeAL<-parentChild2taxonTree(foramAL[,2:1])
treeAMb<-parentChild2taxonTree(foramAMb[,2:1])
treeALb<-parentChild2taxonTree(foramALb[,2:1])

layout(matrix(1:4,2,2))
plot(treeAM,main='treeAM',show.tip.label=FALSE)
plot(treeAL,main='treeAL',show.tip.label=FALSE)
plot(treeAMb,main='treeAMb',show.tip.label=FALSE)
plot(treeALb,main='treeALb',show.tip.label=FALSE)

# FYI 
# in case you were wondering
# you would *not* time-scale these Frankenstein monsters



###########################################

# Checking stratigraphic ranges

# do all first occurrence dates occur before last occurrence dates?
	# we'll check the original datasets here

checkFoLo<-function(data){
	diffDate<-data[,3]-data[,4]	#subtract LO from FO
	isGood<-all(diffDate>=0)	#is it good
	return(isGood)
	}
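
# A hypothetical illustration of what this check would catch, using the
	# same FO-minus-LO logic inline on made-up dates (in Ma), where the
	# second taxon has its first and last appearances swapped
toyDates<-data.frame(
	name=c('A','B'),
	ancestor=c(NA,'A'),
	FO=c(20,5),
	LO=c(0,10),
	stringsAsFactors=FALSE)
toyIsGood<-all((toyDates[,3]-toyDates[,4])>=0)
toyIsGood	# FALSE, due to taxon 'B'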

checkFoLo(foramAM)
checkFoLo(foramAL)
checkFoLo(foramAMb)
checkFoLo(foramALb)

#cool, but do all ancestors appear before their descendants?
	# easier to check unified fossilRecord2fossilTaxa format here

checkAncOrder<-function(taxa){
	#get ancestor's first occurrence
	ancFO<-taxa[taxa[,2],3]
	#get descendant's first occurrence	
	descFO<-taxa[,3]
	diffDate<-ancFO-descFO	#subtract descFO from ancFO
	#remove NAs due to root taxon
	diffDate<-diffDate[!is.na(diffDate)]
	isGood<-all(diffDate>=0)	#is it all good	
	return(isGood)
	}

checkAncOrder(taxaAM)
checkAncOrder(taxaAL)
checkAncOrder(taxaAMb)
checkAncOrder(taxaALb)

#now, are there gaps between the last occurrence of ancestors
	# and the first occurrence of descendants?
	# (shall we call these 'stratophenetic ghost branches'?!)
	# These shouldn't be problematic, but do they occur in this data?
# After all, fossilRecord2fossilTaxa output tables are designed for
	   # fully observed simulated fossil records with no gaps.

sumAncDescGap<-function(taxa){
	#get ancestor's last occurrence
	ancLO<-taxa[taxa[,2],4]
	#get descendant's first occurrence	
	descFO<-taxa[,3]
	diffDate<-ancLO-descFO	#subtract descFO from ancLO
	#remove NAs due to root taxon
	diffDate<-diffDate[!is.na(diffDate)]
	#should be negative or zero, positive values are gaps
	gaps<-c(0,diffDate[diffDate>0])
	sumGap<-sum(gaps)
	return(sumGap)
	}
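
# To illustrate with made-up numbers (hypothetical, not from Aze et al.):
	# the ancestor (row 1) last occurs at 12 Ma, but its descendant
	# (row 2) first appears at 10 Ma, leaving an unsampled 2 Ma gap
toyTaxa<-cbind(taxon.id=1:2,ancestor.id=c(NA,1),
	orig.time=c(20,10),ext.time=c(12,0))
# same logic as sumAncDescGap, written inline
toyGap<-toyTaxa[toyTaxa[,2],4]-toyTaxa[,3]
toyGap<-toyGap[!is.na(toyGap)]
toySumGap<-sum(c(0,toyGap[toyGap>0]))
toySumGap	# total gap of 2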

#get the total gap between ancestor LO and child FO
sumAncDescGap(taxaAM)
sumAncDescGap(taxaAL)
sumAncDescGap(taxaAMb)
sumAncDescGap(taxaALb)

#It appears there are *no* gaps between ancestors and their descendants
	#in the Aze et al. foram dataset... wow!

###############

 

# Creating time-scaled phylogenies from the Aze et al. data

# Aze et al. (2011) defines anagenesis such that taxa may overlap
# in time during a transitional period (see Ezard et al. 2012
# for discussion of this definition). Thus, we would expect that
# paleotree obtains very different trees for morphospecies versus
# lineages, but very similar phylogenies for datasets where budding
# taxa are retained or arbitrarily broken into bifurcating units.

# We can use the function taxa2phylo to directly create
# time-scaled phylogenies from the Aze et al. stratophenetic data

timetreeAM<-taxa2phylo(taxaAM)
timetreeAL<-taxa2phylo(taxaAL)
timetreeAMb<-taxa2phylo(taxaAMb)
timetreeALb<-taxa2phylo(taxaALb)

layout(matrix(1:4,2,2))
plot(timetreeAM,main='timetreeAM',show.tip.label=FALSE)
axisPhylo()
plot(timetreeAL,main='timetreeAL',show.tip.label=FALSE)
axisPhylo()
plot(timetreeAMb,main='timetreeAMb',show.tip.label=FALSE)
axisPhylo()
plot(timetreeALb,main='timetreeALb',show.tip.label=FALSE)
axisPhylo()

#visually compare the two pairs we expect to be close to identical

#morphospecies
layout(1:2)
plot(timetreeAM,main='timetreeAM',show.tip.label=FALSE)
axisPhylo()
plot(timetreeAMb,main='timetreeAMb',show.tip.label=FALSE)
axisPhylo()

#lineages
layout(1:2)
plot(timetreeAL,main='timetreeAL',show.tip.label=FALSE)
axisPhylo()
plot(timetreeALb,main='timetreeALb',show.tip.label=FALSE)
axisPhylo()

layout(1)

#compare the summary statistics of the trees
Ntip(timetreeAM)
Ntip(timetreeAMb)
Ntip(timetreeAL)
Ntip(timetreeALb)
# very different!

# after dropping anagenetic zero-length-terminal-edge ancestors
# we would expect morphospecies and lineage phylogenies to be very similar

#morphospecies
Ntip(dropZLB(timetreeAM))
Ntip(dropZLB(timetreeAMb))
#identical!

#lineages
Ntip(dropZLB(timetreeAL))
Ntip(dropZLB(timetreeALb))
# ah, very close, off by a single tip
# ...probably a very short ZLB outside tolerance

#we can create some diversity plots to compare

multiDiv(data=list(timetreeAM,timetreeAMb),
	plotMultCurves=TRUE)

multiDiv(data=list(timetreeAL,timetreeALb),
	plotMultCurves=TRUE)

# we can see that the morphospecies datasets are identical
	# that's why we can only see one line
# some very slight disagreement between the lineage datasets
	# around ~30-20 Ma

#can also compare morphospecies and lineages diversity curves

multiDiv(data=list(timetreeAM,timetreeAL),
	plotMultCurves=TRUE)

#they are similar, but some peaks are missing from lineages
	# particularly around ~20-10 Ma
