# Dissimilarity/distance indices for sequence data

### Description

This function calculates different dissimilarity/distance indices between sequences.

### Usage

```
sequences.distance(sequences = NULL, groups = NULL,
     method = c("levenshtein", "cosine", "q-gram", "jaccard", "ja-wi",
                "dam-le", "hamming", "osa", "lcs"),
     divLength = FALSE)
```

### Arguments

| Argument | Description |
|---|---|
| `sequences` | Vector containing sequences |
| `groups` | Vector containing names of different samples (if present) |
| `method` | Dissimilarity method (see Details) |
| `divLength` | Divide sequences into subsets of the same sequence length? (default: FALSE) |

### Details

This function calculates dissimilarity/distance indices based on sequences. The following distances can be chosen: Levenshtein, cosine, q-gram, Jaccard, Jaro-Winkler (`ja-wi`), Damerau-Levenshtein (`dam-le`), Hamming, optimal string alignment (`osa`) and longest common substring (`lcs`). For details see `stringdist-metrics`.
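To make the method names more concrete, the sketch below computes one of the listed distances directly with the stringdist package. The mapping from this function's method names to stringdist method codes is an assumption based on the reference to `stringdist-metrics`; it is not taken from the package source. The example sequences are made up.

```r
library(stringdist)

# Assumed mapping from sequences.distance() method names to stringdist codes
method_map <- c(levenshtein = "lv", cosine = "cosine", "q-gram" = "qgram",
                jaccard = "jaccard", "ja-wi" = "jw", "dam-le" = "dl",
                hamming = "hamming", osa = "osa", lcs = "lcs")

# Hypothetical amino acid sequences (not from the package data sets)
seqs <- c("CARDYW", "CARDFW", "CASDYW")

# Pairwise Levenshtein distances between all sequences
d <- stringdistmatrix(seqs, seqs, method = method_map[["levenshtein"]])
dimnames(d) <- list(seqs, seqs)
d
```

Swapping `"levenshtein"` for another key of `method_map` selects a different metric. Note that Hamming distance is only defined for strings of equal length.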

### Value

Output is a distance matrix containing dissimilarity indices/distances between sequences.
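The return structure for `divLength = TRUE` is not described here. As a rough illustration of what dividing sequences into subsets of equal length means, the following sketch groups made-up sequences by length and computes a Levenshtein distance matrix per group using base R only (`adist` from utils). It is not the package's implementation, and the variable names are hypothetical.

```r
# Hypothetical example sequences (not from the package data sets)
seqs <- c("CARDYW", "CARDFW", "CAKW", "CARW")

# Group sequences by their length
by_length <- split(seqs, nchar(seqs))

# One Levenshtein distance matrix per length group
dist_by_length <- lapply(by_length, function(s) {
  m <- adist(s)
  dimnames(m) <- list(s, s)
  m
})

dist_by_length
```

Each list element is named after the sequence length it covers, so `dist_by_length[["4"]]` holds the distances among the sequences of length 4.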

### Author(s)

Julia Bischof

### References

van der Loo M (2014). The stringdist package for approximate string matching. The R Journal, 6, pp. 111-122. http://CRAN.R-project.org/package=stringdist

### See Also

`dist.PCoA`, `plotDistPCoA`, `geneUsage.distance`

### Examples

```
## Not run: 
data(clones.ind)
data(clones.allind)
dist1 <- sequences.distance(sequences = clones.ind$unique_CDR3_sequences_AA,
     method = "levenshtein", divLength = TRUE)
dist2 <- sequences.distance(sequences = clones.allind$unique_CDR3_sequences_AA,
     groups = clones.allind$individuals, method = "cosine", divLength = FALSE)

## End(Not run)
```
