Description Usage Arguments Details Value Warning Author(s) References See Also Examples

This function computes and returns the set of all optimal but equivalent edit transcripts that transforms one sequences into another at minimum cost, as well as the corresponding aligned sequences, or, alternatively a combined edit graph.

1 2 3 |

`x,y` |
a numeric or string vector. |

`method` |
a mnemonic string referencing a distance measure. |

`weight` |
vector or matrix of parameter values. |

`exclude` |
argument to factor. |

`graph` |
option to compute the combined edit graph. |

`partial` |
option to compute an approximate substring match. |

Function `sdists.trace`

complements the distance computation between
sequences by `sdists`

. So, please, see the details of
`method`

, `weight`

, and `exclude`

there. However, note the
following differences: 1) you can supply only two sequences, either as
vectors of numeric symbol codes, factors, or as strings, i.e. scalar
vectors of type `character`

. 2) you can supply a weight matrix with
the rownames and colnames representing the symbol sets of the first and
second sequence. For instance, this allows you to align a sequence with
the profile of a multiple alignment. 3) if `method = "ow"`

the
space symbol `""`

is included in the factor levels so that you can
conveniently replace `NA`

in the aligned sequences.

A transcript uses the character codes `I`

, `D`

, `R`

, and
`M`

, for insert, delete, replace, and match operations, which
transform the first into the second sequence. Thus, conceptually a symbol
has to be inserted into the first, deleted from the second, replaced in the
first sequence, or matched in both, to obtain the second sequence. However,
in the aligned sequences you will see `NA`

, where an insert or delete
would take place, indicating space.

In the case of a local alignment different symbols are used for the
prefix and/or suffix of the alignment: `i`

, `d`

, and `?`

for insert, delete, and replace or match operations. However, note that
their sole purpose is to obtain a common representation of the two
sequences. Finally, only alignments of maximal length are reported.

The time complexity of finding a transcript is *O(n+m)* for two
sequences of length n and m, respectively *O(n*m)* for the local
alignment problem. However, note that the runtime for generating all
transcripts can be *O((n*m)^3)* in the worst case.

If `partial = FALSE`

computes an approximate substring match of
`x`

(the pattern) in `y`

, for `method = "ow"`

only.
Returns the subset of paths which require the maximum number of match
and initial and final insert operations.

A list with components each a list of two factors, the aligned sequences.
The names of the components are the edit transcripts, and the attribute
`value`

contains the minimum cost, i.e. the distance (or negative
similarity).

If `graph = TRUE`

a vector of edit transcripts is returned with
attributes `value`

, `table`

, `pointer`

, and `graph`

.
The second contains the values of the dynamic programming table and the
third a list of vectors `x0, y0, x1, y1`

representing the
(back)pointers. Similarly, the fourth attribute is a list of vectors
`x0, y0, x1, y1, weight`

representing the edge set of all optimal
paths. That is, each tuple contains the `from`

and `to`

coordinates as used by `segments`

, each representing a pair of
indexes into the first and second sequence, and the number of times an
edge occurs on a path. Note that the origin of the coordinate system
(0,0) corresponds to the element of `table`

indexed by
(`""`

,`""`

),
where `""`

indicates the space symbol. Thus, if used as subscripts
the coordinates have to be offset by one.

The interface is experimental and may change in the future

Christian Buchta

D. Gusfield (1997). *Algorithms on Strings, Trees, and Sequences*.
Cambridge University Press, Chapter 11.

`sdists`

for computation of distances between sequences,
`segments`

for plotting of edge sets,
`plot.sdists.graph`

for visualizing alignments.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | ```
### from the book
x1 <- "vintner"
y1 <- "writers"
b1 <- sdists.trace(x1, y1, weight=c(1,1,0,1))
b1
## longest common subsequence ?
sdists.trace("a","b", weight=c(0,0,-1,0))
## from the book
w2 <- matrix(-2,ncol=13,nrow=13)
w2[1,] <- w2[,1] <- -1
diag(w2) <- c(0,rep(2,12))
x2 <- "pqraxabcstvq"
y2 <- "xyaxbacsll"
colnames(w2) <- c("",unique(strsplit(paste(x2, y2, sep = ""),"")[[1]]))
b2 <- sdists.trace(x2, y2, method="awl", weight=w2)
b2
## alignment with different symbol sets
x3 <- "121314"
y3 <- "ABACAD"
w3 <- matrix(-1,nrow=5,ncol=5)
diag(w3) <- 0
rownames(w3) <- c("","1","2","3","4")
colnames(w3) <- c("","A","B","C","D")
b3 <- sdists.trace(x3, y3, method="aw", weight=w3)
b3
## partial
b4 <- sdists.trace(x1, y1, weight=c(1,1,0,1), partial = TRUE)
b4
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.