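library('knitr')
library('Inapp')
# Set up a plotting layout with narrow margins
setPar <- function (mfrow=c(1, 2)) {
  par(mfrow=mfrow, mar=c(0.2, 0.2, 1, 0.2), oma=c(0,0,0,0), cex=0.7)
}
ap <- c("Absent", "Present")
rb <- c("Red", "Blue")
# Later chunks pass `orb` as state.labels; it is assumed here to hold the same labels as `rb`
orb <- rb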
Ambiguous data does not pose a problem for the algorithm, but the nature of the ambiguity must be considered when scoring a character.
If it's not clear whether or not a taxon has a tail, then tail colour should be
coded as `?`, denoting that any possible token (including the inapplicable token)
may be the most parsimonious for the tail.
In trees in which the tail can be reconstructed as present, the ambiguous tip will be reconstructed as having a tail of the appropriate colour:
setPar()
hasTail <- ape::read.tree(text="((a, (b, (c, d))), (e, ((f, (g, X)), (h, i))));")
lacksTail <- ape::read.tree(text="((a, (b, (X, (c, d)))), (e, ((f, g), (h, i))));")
tail <- "000001111?"
col <- "-----1122?"
vignettePlot(hasTail, tail, legend.pos='topleft', na=FALSE, main="Tail", state.labels = ap)
vignettePlot(hasTail, col, legend.pos='topleft', main="Tail colour", state.labels = orb)
In trees in which the tail cannot be reconstructed as present without inferring a homoplasious origin, the tail colour will be reconstructed as inapplicable:
setPar()
vignettePlot(lacksTail, tail, legend.pos='topleft', na=FALSE, main="Tail", state.labels = ap)
vignettePlot(lacksTail, col, legend.pos='topleft', main="Tail colour", state.labels = orb)
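The character strings passed to the plots above pack one token per taxon into a single string. As a minimal base-R sketch, the helper below (`splitTokens`, a hypothetical name that is not part of any package) splits such a string into tokens, keeping each `{...}` ambiguity group together. The taxon order used to label the tokens is an assumption made for illustration only.

```r
# Split a character string into one token per taxon, keeping each
# {...} ambiguity group together as a single token.
# `splitTokens` is a hypothetical helper, not a function from any package.
splitTokens <- function(chars) {
  regmatches(chars, gregexpr("\\{[^}]*\\}|.", chars))[[1]]
}

# ASSUMPTION: tokens are matched to taxa in this order. Check how your own
# plotting or scoring function orders tips before relying on it.
taxa <- c(letters[1:9], "X")

coding <- data.frame(
  taxon  = taxa,
  tail   = splitTokens("000001111?"),
  colour = splitTokens("-----1122?")
)
print(coding)
```

Laying the two characters side by side makes it easy to check that every taxon coded as lacking a tail (`0`) is coded as inapplicable (`-`) for tail colour, and that the ambiguous taxon is `?` in both characters.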
If a taxon is known to have a tail, there are two scenarios for ontologically dependent transformational characters:
If the subordinate character must take one of a finite set of values, then the unobserved property of the tail is known to belong to these values and should be coded accordingly.
For example:
Tail: (0), absent; (1), present
Tail margin: (1), smooth; (2), serrated.
Assume that the tail margin must be either smooth or serrated, and there is no
reason to assume that either state is ancestral (i.e. the character is strictly
transformational). Tail margin should then be coded as `{12}`: i.e. the tail
is known to have taken one of the two states 1 or 2.
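The difference between the ambiguity tokens can be made concrete by listing the states that each one permits. The function below is an illustrative base-R sketch (`possibleStates` is a hypothetical name, and this is not how the package represents tokens internally): `?` expands to every applicable state plus the inapplicable token, whereas `{12}` expands only to the states listed between the braces.

```r
# Expand one token into the set of states it permits.
# `states` lists the applicable states defined for the character.
# Hypothetical helper for illustration; not part of any package.
possibleStates <- function(token, states) {
  if (token == "?") {
    # Fully ambiguous: any applicable state, or the inapplicable token
    c("-", states)
  } else if (grepl("^\\{.*\\}$", token)) {
    # Partial ambiguity: only the states listed between the braces
    strsplit(gsub("[{}]", "", token), "")[[1]]
  } else {
    # Unambiguous token: returned unchanged
    token
  }
}

possibleStates("{12}", c("1", "2"))  # "1" "2"
possibleStates("?", c("1", "2"))     # "-" "1" "2"
```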
A more complicated situation arises where a subordinate character may have unobserved states, as with
Tail colour: (1), red; (2), blue.
A taxon that is known to have a tail, but whose tail colour is uncertain, should
generally be coded as `?`.
Coding it as `{12}` would be appropriate if the tail were known with preternatural
certainty to be homologous with other tails in the dataset, in which case
it would be most parsimonious to assume that the tail is the same colour
as the ancestor of the tip, which was necessarily either red or blue.
setPar(c(1, 1))
vignettePlot(ape::read.tree(text="(a, (b, (c, (d, (e, (f, (g, (h, (i1, i2)))))))));"),
             '---11122{12}{12}', legend.pos='topleft', state.labels = orb, passes=4)
But if, as will more often be the case, homology of the tails is not known a priori, then it is possible that this taxon has a tail that is not homologous with any other tail whose colour has been observed.
In this case, coding the tail colour as `{12}` denotes that the tail is the same
colour as a tail that has already been observed. This means that the independent
origin of the tail also represents an independent
origin of this particular colour, and hence an instance of homoplasy.
setPar(c(1, 1))
vignettePlot(ape::read.tree(text="((a, (i1, i2)), (b, (c, (d, (e, (f, (g, h)))))));"),
             '---11122{12}{12}', legend.pos='topleft', state.labels = orb, passes=4)
Coding the tail colour as `?` allows the possibility that the independently evolved
tail has a different colour to the tails already observed: green, perhaps.
Reconstructing the tail colour as a colour that has not already been observed
avoids an instance of homoplasy, and is therefore more parsimonious.
If the unknown tail evolved independently and was green, the original character formulation, which only provides tokens for red and blue tails, cannot be applied to it and is thus inapplicable. Our algorithm will therefore reconstruct tail colour as inapplicable in such a taxon.
setPar(c(1, 1))
vignettePlot(ape::read.tree(text="((a, (i1, i2)), (b, (c, (d, (e, (f, (g, h)))))));"),
             '---11122??', legend.pos='topleft', state.labels = orb, passes=4)
We therefore recommend the following coding schema for ambiguous tips, depending on whether the tail is known to be present, is of unknown presence, or is known to be absent:
# Rows: characters; columns: what is known about the presence of the tail.
# Tail margin has states 1 and 2, so its ambiguous coding is {12}.
mDep <- matrix(c("1",    "?", "0",
                 "{12}", "?", "-",
                 "?",    "?", "-"), byrow=TRUE, ncol=3)
colnames(mDep) <- c('Present', 'Unknown', 'Absent')
rownames(mDep) <- c("Tail: (0), absent; (1), present.",
                    "Tail margin: (1), smooth; (2), serrated.",
                    "Tail colour: (1), red; (2), blue.")
knitr::kable(mDep, caption="Recommended coding for unknown contingent characters")
"Tail margin" represents a character that can only take the states observed (smooth or serrated), whereas tail colour represents a character that may take an unobserved state (e.g. green).
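The recommendations for subordinate characters whose own state has not been observed can be summarised as a small decision rule. The sketch below uses hypothetical names (`recommendCoding`, `closedStateSet`) that do not belong to any package; `closedStateSet = TRUE` stands for a character such as tail margin, which can only take the listed states, and `FALSE` for a character such as tail colour, which may take an unobserved state.

```r
# Suggest a token for a subordinate character whose own state is unobserved.
# parent:         what is known about the parent structure (here, the tail)
# closedStateSet: TRUE if the character can only take the listed states
# states:         applicable states defined for the character
# Hypothetical helper for illustration; not part of any package.
recommendCoding <- function(parent = c("present", "unknown", "absent"),
                            closedStateSet, states = c("1", "2")) {
  parent <- match.arg(parent)
  switch(parent,
    absent  = "-",  # no tail: the character is inapplicable
    unknown = "?",  # tail may be absent: allow any token
    present = if (closedStateSet) {
      # Must be one of the listed states, e.g. {12}
      paste0("{", paste(states, collapse = ""), "}")
    } else {
      # May be an unobserved state (e.g. green)
      "?"
    }
  )
}

recommendCoding("present", closedStateSet = TRUE)   # "{12}"
recommendCoding("present", closedStateSet = FALSE)  # "?"
recommendCoding("absent",  closedStateSet = TRUE)   # "-"
```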