# Create a substitution-cost matrix

### Description

The substitution-cost matrix is used when computing distances between sequences with optimal matching. The function creates the matrix either from a constant value or from the transition rates computed from the sequence data; other methods may be implemented in the future.

### Usage


### Arguments

`seqdata` |
a sequence object as returned by the seqdef function. |

`method` |
method to compute transition rates. At this time, the methods available are constant
value ( |

`cval` |
the constant substitution cost if method |

`with.missing` |
if |

`miss.cost` |
the substitution cost for the missing state. The default set it to |

`time.varying` |
Logical. If |

`weighted` |
Logical. If |

`transition` |
Only used if |

`lag` |
Integer. Only used with ( |

`missing.trate` |
Logical. Only used with ( |

### Details

The substitution-cost matrix has dimension *ns* × *ns*, where *ns* is the number of states in the alphabet of the sequence object. The element *(i,j)* of the matrix is the cost of substituting state *i* with state *j*.

With the `"CONSTANT"` method, the substitution costs are the same for all the states, with a default value of 2. An alternative value can be provided by the user. When the `"TRATE"` (transition rates) method is chosen, the transition rates between all states are computed using the seqtrate function. The substitution cost between states *i* and *j* is obtained with the formula

*SC(i,j) = cval - P(i,j) - P(j,i)*

where *P(i,j)* is the transition rate from state *i* to *j*.
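The two methods can be sketched in base R on a toy data set. This is an illustration only, not the package's implementation: the sequences, the object names (`seqs`, `P`, `sc_const`, `sc_trate`) and the zero diagonal (no cost for substituting a state with itself) are assumptions made for the example, and the transition rates are computed by hand as row-normalised counts of moves between adjacent positions rather than with seqtrate.

```r
# Toy data (hypothetical): three sequences over the alphabet {A, B, C},
# one row per sequence, one column per position.
seqs <- rbind(
  c("A", "A", "B", "C"),
  c("A", "B", "B", "C"),
  c("B", "B", "C", "C")
)
states <- c("A", "B", "C")
ns <- length(states)

# Count the moves from position t to position t + 1 over all sequences.
from <- factor(as.vector(seqs[, -ncol(seqs)]), levels = states)
to   <- factor(as.vector(seqs[, -1]), levels = states)
counts <- table(from, to)

# Row-normalise the counts: P[i, j] is the share of moves out of state i
# that end in state j.
P <- matrix(prop.table(counts, margin = 1), nrow = ns, ncol = ns,
            dimnames = list(states, states))

cval <- 2

# Constant costs: the same value for every pair of distinct states.
sc_const <- matrix(cval, nrow = ns, ncol = ns,
                   dimnames = list(states, states))
diag(sc_const) <- 0  # assumption: substituting a state with itself is free

# Transition-rate costs: SC(i,j) = cval - P(i,j) - P(j,i).
sc_trate <- cval - P - t(P)
diag(sc_trate) <- 0  # same assumption as above

print(sc_const)
print(round(sc_trate, 3))
```

In this toy data, two of the three moves out of A go to B and no move goes from B to A, so the A/B cost is 2 - 2/3 - 0, roughly 1.33. A and C never follow each other, so their cost stays at `cval`. Because *P(i,j)* and *P(j,i)* both enter the formula, the resulting matrix is symmetric.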

### Author(s)

Matthias Studer and Alexis Gabadinho (first version) (with Gilbert Ritschard for the help page)

### References

Gabadinho, A., G. Ritschard, N. S. Müller and M. Studer (2011). Analyzing and Visualizing State Sequences in R with TraMineR. *Journal of Statistical Software* **40**(4), 1-37.

Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2010). Mining Sequence Data in `R` with the `TraMineR` package: A user's guide. Department of Econometrics and Laboratory of Demography, University of Geneva.

### See Also

`seqtrate`, `seqdef`, `seqdist`.

### Examples

```
## Defining a sequence object with columns 10 to 25
## in the 'biofam' example data set
library(TraMineR)
data(biofam)
biofam.seq <- seqdef(biofam, 10:25)
## Optimal matching using transition rates based substitution-cost matrix
## and insertion/deletion costs of 3
trcost <- seqsubm(biofam.seq, method="TRATE")
biofam.om <- seqdist(biofam.seq,method="OM",indel=3,sm=trcost)
## Optimal matching using constant value (2) substitution-cost matrix
## and insertion/deletion costs of 3
ccost <- seqsubm(biofam.seq, method="CONSTANT", cval=2)
biofam.om.c2 <- seqdist(biofam.seq, method="OM",indel=3,sm=ccost)
## Displaying the distance matrix for the first 10 sequences
biofam.om.c2[1:10,1:10]
## =================================
## Example with weights and missings
## =================================
data(ex1)
ex1.seq <- seqdef(ex1,1:13, weights=ex1$weights)
## Unweighted
subm <- seqsubm(ex1.seq, method="TRATE", with.missing=TRUE, weighted=FALSE)
ex1.om <- seqdist(ex1.seq, method="OM", sm=subm, with.missing=TRUE)
## Weighted
subm.w <- seqsubm(ex1.seq, method="TRATE", with.missing=TRUE, weighted=TRUE)
ex1.omw <- seqdist(ex1.seq, method="OM", sm=subm.w, with.missing=TRUE)
ex1.om == ex1.omw
```
