MapCharacters: Map Changes in Ancestral Character States

Description Usage Arguments Details Value Author(s) See Also Examples

View source: R/MapCharacters.R

Description

Maps character changes on a phylogenetic tree containing reconstructed ancestral states.

Usage

1
2
3
4
5
6
MapCharacters(x,
              refPositions = seq_len(nchar(attr(x, "state")[1])),
              labelEdges = FALSE,
              type = "dendrogram",
              ignoreAmbiguity = TRUE,
              ignoreIndels = TRUE)

Arguments

x

An object of class dendrogram with "state" attributes for each node.

refPositions

Numeric vector of reference positions in the original sequence alignment. Only changes at refPositions are reported, and state changes are labeled according to their position in refPositions.

labelEdges

Logical determining whether to label edges with the number of changes along each edge.

type

Character string indicating the type of output desired. This should be (an abbreviation of) one of "dendrogram", "table", or "both". (See value section below.)

ignoreAmbiguity

Logical specifying whether to report changes involving ambiguities. If TRUE (the default), changes involving "N" and other ambiguity codes are ignored.

ignoreIndels

Logical specifying whether to report insertions and deletions (indels). If TRUE (the default), only substitutions of one state with another are reported.

Details

Ancestral state reconstruction affords the ability to identify character changes that occurred along edges of a rooted phylogenetic tree. Character changes are reported according to their index in refPositions. If ignoreIndels is FALSE, adjacent insertions and deletions are merged into single changes occurring at their first position. The table of changes can be used to identify parallel, convergent, and divergent mutations.

Value

If type is "dendrogram" (the default) then the original dendrogram x is returned with the addition of "change" attributes on every edge except the root. If type is "table" then a sorted table of character changes is returned with the most frequent parallel changes at the beginning. If type is "both" then a list of length 2 is provided containing both the dendrogram and table.

Author(s)

Erik Wright eswright@pitt.edu

See Also

IdClusters

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
fas <- system.file("extdata", "Bacteria_175seqs.fas", package="DECIPHER")
dna <- readDNAStringSet(fas)

# align the sequences
rna <- RNAStringSet(RemoveGaps(dna))
rna <- AlignSeqs(rna)
rna # input alignment

d <- DistanceMatrix(rna, type="dist", correction="JC")
tree <- IdClusters(d,
                   method="NJ",
                   type="dendrogram",
                   myXStringSet=rna,
                   reconstruct=TRUE)

out <- MapCharacters(tree, labelEdges=TRUE, type="both")

# plot the tree with defaults
tree <- out[[1]]
plot(tree, horiz=TRUE) # edges show number of changes

# color edges by number of changes
maxC <- 200 # changes at maximum of color spectrum
colors <- colorRampPalette(c("black", "darkgreen", "green"))(maxC)
colorEdges <- function(x) {
   num <- attr(x, "edgetext") + 1
   if (length(num)==0)
       return(x)
   if (num > maxC)
       num <- maxC
   attr(x, "edgePar") <- list(col=colors[num])
   attr(x, "edgetext") <- NULL
   return(x)
}
colorfulTree <- dendrapply(tree, colorEdges)
plot(colorfulTree, horiz=TRUE, leaflab="none")

# look at parallel changes (X->Y)
parallel <- out[[2]]
head(parallel) # parallel changes

# look at convergent changes (*->Y)
convergent <- gsub(".*?([0-9]+.*)", "\\1", names(parallel))
convergent <- tapply(parallel, convergent, sum)
convergent <- sort(convergent, decreasing=TRUE)
head(convergent)

# look at divergent changes (X->*)
divergent <- gsub("(.*[0-9]+).*", "\\1", names(parallel))
divergent <- tapply(parallel, divergent, sum)
divergent <- sort(divergent, decreasing=TRUE)
head(divergent)

# plot number of changes by position
changes <- gsub(".*?([0-9]+).*", "\\1", names(parallel))
changes <- tapply(parallel, changes, sum)
plot(as.numeric(names(changes)),
     changes,
     xlab="Position",
     ylab="Total independent changes")

# count cases of potential compensatory mutations
compensatory <- dendrapply(tree,
    function(x) {
        change <- attr(x, "change")
        pos <- as.numeric(gsub(".*?([0-9]+).*", "\\1", change))
        e <- expand.grid(seq_along(pos), seq_along(pos))
        e <- e[pos[e[, 1]] < pos[e[, 2]],]
        list(paste(change[e[, 1]], change[e[, 2]], sep=" & "))
    })
compensatory <- unlist(compensatory)
u <- unique(compensatory)
m <- match(compensatory, u)
m <- tabulate(m, length(u))
compensatory <- sort(setNames(m, u), decreasing=TRUE)
head(compensatory) # ranked list of concurrent mutations

DECIPHER documentation built on Nov. 8, 2020, 8:30 p.m.