The substitution-cost matrix is used when computing distances between sequences with optimal matching. The function builds this matrix either from a constant cost or from the transition rates computed from the sequence data; other methods may be implemented in the future.


`seqdata` |
a sequence object as returned by the seqdef function. |

`method` |
method to compute transition rates. At this time, the methods available are constant
value ( |

`cval` |
the constant substitution cost if method |

`with.missing` |
if |

`miss.cost` |
the substitution cost for the missing state. The default set it to |

`time.varying` |
Logical. If |

`weighted` |
Logical. If |

`transition` |
Only used if |

`lag` |
Integer. Only used with ( |

`missing.trate` |
Logical. Only used with ( |

The substitution-cost matrix has dimension *ns* × *ns*, where
*ns* is the number of states in the alphabet of the
sequence object. The element *(i,j)* of the matrix is the cost of
substituting state *i* with state *j*.
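
To make this layout concrete, here is a minimal base R sketch of such a matrix for a hypothetical three-state alphabet. The state names, the uniform cost of 2, and the zero diagonal are placeholders chosen for illustration; they are not taken from any data set or from the package source.

```r
# Hypothetical alphabet with ns = 3 states (names are placeholders).
states <- c("A", "B", "C")
ns <- length(states)

# ns x ns matrix with the same illustrative cost of 2 off the diagonal.
sm <- matrix(2, nrow = ns, ncol = ns, dimnames = list(states, states))

# Assumption: replacing a state by itself costs nothing.
diag(sm) <- 0

dim(sm)        # number of rows and columns, both equal to ns
sm["A", "B"]   # element (i, j): cost of substituting state A with state B
```

Naming the rows and columns after the states lets a cost be looked up by state label rather than by position.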

With the `"CONSTANT"` method, the substitution costs are the
same for all the states, with a default value of 2. An alternative
value can be provided by the user. When the `"TRATE"` (transition rates)
method is chosen, the transition rates between all
states are computed using the seqtrate function. The
substitution cost between states *i* and *j* is obtained with
the formula

*SC(i,j) = cval - P(i,j) - P(j,i)*

where *P(i,j)* is the transition rate from state *i* to
*j*.
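
The formula can be checked by hand on a small case. The following base R sketch applies it to a hypothetical three-state transition-rate matrix; the rates are invented for illustration, `cval = 2` is assumed, and setting the diagonal to 0 is likewise an assumption rather than something stated above. It does not call the package and only mirrors the arithmetic.

```r
# Hypothetical transition rates P(i, j); each row sums to 1.
states <- c("A", "B", "C")
P <- matrix(c(0.8, 0.1, 0.1,
              0.2, 0.7, 0.1,
              0.0, 0.3, 0.7),
            nrow = 3, byrow = TRUE,
            dimnames = list(states, states))

cval <- 2  # assumed constant for this illustration

# SC(i, j) = cval - P(i, j) - P(j, i), applied to all pairs at once:
# t(P) holds P(j, i) at position (i, j).
SC <- cval - P - t(P)

# Assumption: substituting a state with itself costs 0.
diag(SC) <- 0

SC
```

Because the formula subtracts both *P(i,j)* and *P(j,i)*, the resulting matrix is symmetric even when the transition rates are not: here the cost between A and B is 2 - 0.1 - 0.2 = 1.7 in both directions. Frequent transitions between two states give a lower substitution cost.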

Matthias Studer and Alexis Gabadinho (first version) (with Gilbert Ritschard for the help page)

Gabadinho, A., G. Ritschard, N. S. Müller and M. Studer (2011). Analyzing and Visualizing State Sequences in R with TraMineR. *Journal of Statistical Software* **40**(4), 1-37.

Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2010). Mining Sequence Data in `R` with the `TraMineR` package: A user's guide. Department of Econometrics and Laboratory of Demography, University of Geneva.

`seqtrate`, `seqdef`, `seqdist`.

```
library(TraMineR)

## Defining a sequence object with columns 10 to 25
## in the 'biofam' example data set
data(biofam)
biofam.seq <- seqdef(biofam, 10:25)

## Optimal matching using transition rates based substitution-cost matrix
## and insertion/deletion costs of 3
trcost <- seqsubm(biofam.seq, method = "TRATE")
biofam.om <- seqdist(biofam.seq, method = "OM", indel = 3, sm = trcost)

## Optimal matching using constant value (2) substitution-cost matrix
## and insertion/deletion costs of 3
ccost <- seqsubm(biofam.seq, method = "CONSTANT", cval = 2)
biofam.om.c2 <- seqdist(biofam.seq, method = "OM", indel = 3, sm = ccost)

## Displaying the distance matrix for the first 10 sequences
biofam.om.c2[1:10, 1:10]

## =================================
## Example with weights and missings
## =================================
data(ex1)
ex1.seq <- seqdef(ex1, 1:13, weights = ex1$weights)

## Unweighted
subm <- seqsubm(ex1.seq, method = "TRATE", with.missing = TRUE, weighted = FALSE)
ex1.om <- seqdist(ex1.seq, method = "OM", sm = subm, with.missing = TRUE)

## Weighted
subm.w <- seqsubm(ex1.seq, method = "TRATE", with.missing = TRUE, weighted = TRUE)
ex1.omw <- seqdist(ex1.seq, method = "OM", sm = subm.w, with.missing = TRUE)

## Element-wise comparison of the unweighted and weighted distance matrices
ex1.om == ex1.omw
```
