library('Inapp')
library('ape')
ap <- c("Absent", "Present")
rb <- c("Red", "Blue")
redBlue <- TreeTools::brewer[[2]]
orb <- c('', rb)

yesCol <- "#7fbf7be0"
noCol <- "#fc8d5960"

setPar <- function (mfrow=c(1, 2)) par(mfrow=mfrow, mar=c(0.2, 0.2, 1, 0.2), oma=c(0,0,0,0), cex=0.7)

Problems with the Fitch algorithm {#problems}

The Fitch algorithm [@Fitch1971] counts changes in a character. It assumes that the character is applicable throughout the tree. This assumption does not lead to error if:

Red tails, blue tails

Maddison [-@Maddison1993] provided the following example to demonstrate the problem encountered by the Fitch algorithm when inapplicable characters were present.

Consider the following tree, each node of which is supported by a number of characters. Tail colour (illustrated; 0 = red, 1 = blue) has not yet been considered, but has the potential to resolve the polytomy on the left hand side (bold).

par(mar=rep(0.2, 4))
treeUnresolved <- ape:::read.tree(text="(((((A, B, C, D), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
treeUnresolved$edge.length <- rep(1, dim(treeUnresolved$edge)[1])
edgeWidths <- rep(2, dim(treeUnresolved$edge)[1])
edgeWidths[5:8] <- 4
plot(treeUnresolved, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width = edgeWidths)
c0 <- TreeTools::brewer[[2]][1]
c1 <- TreeTools::brewer[[2]][2]
cX <- "#cccccc"
tiplabels(c(0, 1, 0, 1, rep('-', 6), 1, 1, 0, 0), bg=c(c0, c1, c0, c1, rep(cX, 6), c1, c1, c0, c0))

In the bold region, tail colour should group the red-tailed tips together, and the blue-tailed tips together, but does not establish whether the ancestor of the left-hand tail-bearing clade had a red or blue tail.

setPar(c(2, 3))
pair <- ape::read.tree(text="(((a, b), (c, d)));")
walk <- ape::read.tree(text="((a, (b, (c, d))));")
c0011<- c(0, 0, 1, 1)
c0101 <- c(0, 1, 0, 1)
pair$edge.length <- walk$edge.length <- rep(1, 8)

plot(pair, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width=2, main = "Good")
tiplabels(c0011, bg=redBlue[c0011 + 1])
plot(walk, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width=2, main='Equally good')
tiplabels(c0011, bg=redBlue[c0011 + 1])
plot(walk, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width=2, main='Equally good')
tiplabels(rev(c0011), bg=redBlue[2 - c0011])
plot(pair, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width=2, main = "Bad")
tiplabels(c0101, bg=redBlue[c0101 + 1])
plot(walk, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width=2, main='Equally Bad')
tiplabels(c0101, bg=redBlue[c0101 + 1])
plot(walk, direction='upwards', use.edge.length=TRUE, show.tip.label=FALSE,
     edge.width=2, main='Equally Bad')
tiplabels(rev(c0101), bg=redBlue[2 - c0101])

Reductive coding

Under reductive coding, the tail and its colour are described in two character statements:

Tail: (0), absent; (1), present.

Tail, colour: (0), red; (1), blue; (?), inapplicable.

Consider the following two trees, each of which receives a score of two for the first character (presence of tail). The score of the second character (tail colour) is not as desired.

The Fitch algorithm will prefer trees in which the left-hand tail-bearing clade has a blue tail, simply because the right-hand tail-bearing clade ancestrally did.

par(mar=rep(0.2, 4), mfrow=c(2, 1), cex=0.8)
treeA <- ape::read.tree(text="(((((((A, B), C), D), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
treeB <- ape::read.tree(text="(((((A, (B, (C, D))), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
vignettePlot(treeA, '0011------1100', na=FALSE, state.labels=rb)     
vignettePlot(treeB, '0011------1100', na=FALSE, state.labels=rb)

Notice the additional step reconstructed at the root node: the Fitch algorithm reconstructs a change in tail colour in a taxon that doesn't have a tail!

This reconstruction is not logically consistent.

An exception

If the parent character can parsimoniously be reconstructed as present at every internal node in a single unbroken region of a tree, and nowhere else, then reductive coding does work successfully. Reductive coding may therefore be appropriate if only a subset of all possible trees are under consideration, and is always (i.e. for all trees) appropriate if a character exhibits fewer than three inapplicable tokens.

Inapplicable as an extra state

An alternative is to code the inapplicable token as an extra state:

Tail: (0), absent; (1), present.

Tail, colour: (0), red; (1), blue; (2), inapplicable.

This seems to resolve the problem case that we encountered with reductive coding:

par(mar=rep(0.2, 4), mfrow=c(2, 1), cex=0.8)
treeA <- ape::read.tree(text="(((((((A, B), C), D), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
treeB <- ape::read.tree(text="(((((A, (B, (C, D))), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
vignettePlot(treeA, '00112222221100', na=FALSE, state.labels=c(rb, 'Inapplicable'))     
vignettePlot(treeB, '00112222221100', na=FALSE, state.labels=c(rb, 'Inapplicable'))     

Both trees now receive the same score for the 'tail colour' character, which contributes four steps. Two of these steps, however, correspond to steps that have already been counted in the parent character, reflecting the two gains of a tail.

Although this reconstruction is logically consistent, the gain (or loss) of the tail is now reflected in two characters -- characters are not independent of one another.

The outcome is that each ontologically dependent character serves to increase the weight of its parent character. The loss of a tail, for example, would incur a cost of one step in the tail character and one step in each ontologically dependent character, even though it represents a single evolutionary event.

A single multi-state character

A different approach is to use a single character to denote both the presence and the colour of the tail:

Tail: (0), absent; (1), present, red; (2), present, blue.

This seems to resolve the problem case that we encountered with reductive coding:

par(mar=rep(0.2, 4), mfrow=c(2, 1), cex=0.8)
treeA <- ape::read.tree(text="(((((((A, B), C), D), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
treeB <- ape::read.tree(text="(((((A, (B, (C, D))), e), f), g), (h, (i, (j, (K, (L, (M, N)))))));")
vignettePlot(treeA, '11220000002211', na=FALSE, state.labels=c('Absent', 'Present, red', 'Present, blue'))     
vignettePlot(treeB, '11220000002211', na=FALSE, state.labels=c('Absent', 'Present, red', 'Present, blue'))     

However, we now have a situation where the gain/loss of a tail is afforded the same weight as a change in tail colour. We ought to prefer a tree where the tail evolved once (and changed colour) to one where it evolved twice (being a different colour each time).

par(mar=rep(0.2, 4), mfrow=c(2, 1), cex=0.8)
simpleTree <- ape::read.tree(text="((((a, b), c), e), (f, g));")
vignettePlot(simpleTree, "110022", na=FALSE)
nodelabels("+ Tail\n", c(7, 9), pos=3, bg=NA, frame='no', font=2)
text(4, 4.1, "Two gains of tail", cex=1.2)

vignettePlot(simpleTree, "112200", na=FALSE)
nodelabels("+ Tail\n", 8, pos=3, bg=NA, frame='no', font=2)
text(4, 4.1, "One gain of tail (preferable)", cex=1.2)

Sankoff matrices

It would be possible to establish a Sankoff matrix [@Sankoff1975;@Sankoff1983] such that a change between absent and red or absent and blue cost more than a change between red and blue, but this effectively up-weights the tail character, and it's not clear that this is desirable -- or how much this extra weight should be [@Maddison1993].

Symmetric

Consider a character with three ontologically dependent characters:

Presence: Absent / present

Colour: Red / blue

Covering: Scaly / hairy

Shape: Straight / curly

This could be coded as a single transformation series using a Sankoff matrix:

dots <- rep('.', 3)
state <- c(0:3, dots, 8)
costs <- matrix(c(
  c(0, 4, 4, 4, dots, 4),
  c(4, 0, 1, 1, dots, 3),
  c(4, 1, 0, 2, dots, 2),
  c(4, 1, 2, 0, dots, 2),
  c(rep('.', 4), 0, rep('.', 3)),
  c(rep('.', 5), 0, rep('.', 2)),
  c(rep('.', 6), 0, rep('.', 1)),
  c(4, 3, 2, 2, dots, 0)) , byrow=TRUE, ncol=length(state))
costNames <- c("(0), absent", 
                     "(1), present, red, scaly, straight",
                     "(2), present, red, scaly, curly",
                     "(3), present, red, hairy, straight",
                     dots,
                     "(8), present, blue, hairy, curly")
sankoffCaption <- "Tail: Cost to go from left state to top state:"
colnames(costs) <-  state
rownames(costs) <- costNames
knitr::kable(costs, caption=sankoffCaption)

The first thing to note is that each additional ontologically depedent character generates disproportionately more complexity in the Sankoff matrix.

Even if this additional complexity could be handled, the underlying issue remains that losing a tail, which arguably corresponds to a single evolutionary event, is allocated a large cost (here, 4) that grows in line with the number of ontogenetically dependant characters.

Gain and loss asymmetric

At the cost of symmetry, one could argue that the loss of a tail requires a single transformation, whereas the gain requires the addition of a tail and the "setting" of each ontologically dependent character, rendering an asymmetric Sankoff matrix that nevertheless respects triangular inequality [@Wheeler1993]:

costs <- matrix(c(
  c(0, 4, 4, 4, dots, 4),
  c(1, 0, 1, 1, dots, 3),
  c(1, 1, 0, 2, dots, 2),
  c(1, 1, 2, 0, dots, 2),
  c(rep('.', 4), 0, rep('.', 3)),
  c(rep('.', 5), 0, rep('.', 2)),
  c(rep('.', 6), 0, rep('.', 1)),
  c(1, 3, 2, 2, dots, 0)) , byrow=TRUE, ncol=length(state))
colnames(costs) <-  c('(0)', '(1)', '(2)', '(3)', '.', '.', '.', '(8)')
rownames(costs) <- costNames
knitr::kable(costs, caption=sankoffCaption)

Here, though, we encounter a new problem: reconstructions involving very many losses are preferred to those involving a single gain.

setPar(c(2, 1))
par(mar=c(0.2, 0.2, 4.2, 0.2))
sankTree <- ape::read.tree(text="(((((a, b), c), dX), eX), (fX, (gX, (p, (q, r)))));")
#vignettePlot(sankTree, '1110000888', na=FALSE, legend='none', blankNodes=TRUE)
sankTree$edge.length <- rep(1, dim(sankTree$edge)[1])
c1 <- TreeTools::brewer[[3]][1]
c0 <- TreeTools::brewer[[3]][2]
c8 <- TreeTools::brewer[[3]][3]
plot(sankTree, show.tip.label=FALSE, direction='upwards', edge.width=2,
     edge.col= c(c0, c0, c1, c1, c1, c1, c1, rep(c0, 6), rep(c8, 5)))
tiplabels(c(1,1,1,0,0,0,0,8,8,8), bg=c(rep(c1, 3),
                                       rep(c0, 4),
                                       rep(c8, 3)))
legend('bottomleft', bty='n', bg = NULL, pch=15, col="#7fbf7be0", pt.cex=1.8,
      legend = "Two gains: cost = 8") 
nodelabels(c("0->1 (+4)", "0->8 (+4)"), node=c(13, 17), bg="#7fbf7b00", pos=3, frame='no', col="#7fbf7be0", font=2)

plot(sankTree, show.tip.label=FALSE, direction='upwards', edge.width=2,
     edge.col= c(rep(c1, 7), c0, c0, c8, c0, c8, c0, rep(c8, 5)))
edgelabels()
tiplabels(c(1,1,1,0,0,0,0,8,8,8), bg=c(rep(c1, 3),
                                       rep(c0, 4),
                                       rep(c8, 3)))
legend('bottomleft', bty='n', bg = NULL, pch=15, col="#af8dc3e0", pt.cex=1.8,
      legend = "Four losses: cost = 7") 
nodelabels(c("1->8 (+3)", rep("1->0 (+1)",  2), rep("8->0 (+1)", 2)), node=c(11:13, 16:17), 
           bg="#af8dc300", pos=3, frame='no', col="#af8dc3e0", font=2)

Why counting steps cannot work

The failure of the Sankoff approach illustrates a more general problem: if the only thing that is counted is the number of steps, then trees that imply multiple gains and losses of a principal character are not adequately penalised.

To illustrate this point, consider counting only transitions between applicable states (i.e. steps), but not transitions from the applicable state to the inapplicable state:

setPar(c(1,1))
tree <- ape::read.tree(text="((((a, b), c), d), (e, (f, ((g, h), (i, j)))));")
vignettePlot(tree, '11----1122', passes=1:2, legend.pos='none',
            state.labels=orb, counts=2)

nodelabels(node=c(13, 16), pch=15, col=noCol, cex=7)

legend('bottomleft', bty='n',
       c('State change: +1 step', 'Change from inapplicable to applicable: +0 steps'),
       pch=15, pt.cex=1.5, col=c(yesCol, noCol))

The number of steps can be minimized by maximizing the number of independent gains of a parent character.

comb15 <- ape::read.tree(text="(((((a, c), d), e), f), (g, (i, (j, (v, (w, (x, (y, z))))))));")
comb5i <- ape::read.tree(text="(((((vv, c), d), yy), f), (g, (xx, (j, (k, (ww, (m, (n, zz))))))));")

char <- '0000000011111'
sub1 <- '--------12222'
sub2 <- '--------11222'
sub3 <- '--------11122'
sub4 <- '--------11122'
sub5 <- '--------11112'

setPar(c(1, 2))
Compare <- function (char.states, char.label, states.labels, top=FALSE) {
    vignettePlot(comb15, char.states, na=!top, 
                 counts=2, passes=4 - (2 * top),
                 legend.pos='none', state.labels=states.labels)
    legend('topleft', char.label, bty='n')
    legend('bottomleft', pch=15, pt.cex=1.5, col=yesCol, bty='n', "+1 step")
    if (top) title('Single gain of tail (total score = 6)')

    vignettePlot(comb5i, char.states, na=!top, 
                 counts=2,  passes=4 - (2 * top),
                 legend.pos='none', state.labels=states.labels)
    if (!top)  legend('bottomleft', pch=1 , pt.cex=1.5, col=yesCol, bty='n', "+0 steps") 
    if (top) legend('bottomleft', pch=15, pt.cex=1.5, col=yesCol, bty='n', "+5 steps") 
    if (top) title('Five gains of tail (total score = 5)')

}

Compare(char, 'Tail, presence', ap, TRUE)
Compare(sub1, 'Tail, colour', orb)
Compare(sub2, 'Tail, covering', c("", "Scaly", "Hairy"))
Compare(sub3, 'Tail, shape', c("", "Straight", "Curly"))
Compare(sub4, 'Tail, margin', c("", "Smooth", "Serrated"))
Compare(sub5, 'Tail, poison barbs', c("", "Absent", "Present"))

Conclusion

No coding mechanism can generate consistent and logically meaningful tree scores when employing the Fitch algorithm. A new algorithm is needed: one that counts homoplasies instead of steps.



TGuillerme/Inapp documentation built on Feb. 4, 2024, 7:26 a.m.