seq_sim: Compute similarity scores between sequences of integers

seq_sim    R Documentation

Description

Compute similarity scores between sequences of integers

Usage

seq_sim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw"),
  q = 1,
  ...
)

Arguments

a

list of integer vectors (target)

b

list of integer vectors (source). Optional for seq_distmatrix.

method

Distance method on which the similarity score is based. The default is "osa"; see stringdist-metrics.

q

Size of the q-gram; must be nonnegative. Only applies to method='qgram', 'jaccard' or 'cosine'.

...

additional arguments are passed on to seq_dist.
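
As a rough illustration of the q and ... arguments, the sketch below compares two made-up integer sequences with a q-gram based method at two q-gram sizes, and then passes the Winkler prefix factor p through ... to seq_dist. The input vectors and variable names are illustrative assumptions; p is an argument of seq_dist, not of seq_sim itself.

```r
library(stringdist)

# Two illustrative sequences that differ only in their last element
a <- list(c(1L, 2L, 3L, 4L))
b <- list(c(1L, 2L, 3L, 5L))

# q only matters for the q-gram based methods ("qgram", "jaccard", "cosine")
s_q1 <- seq_sim(a, b, method = "jaccard", q = 1)
s_q2 <- seq_sim(a, b, method = "jaccard", q = 2)

# Arguments that seq_sim does not define itself are handed on to seq_dist;
# here p (the Winkler prefix factor) is passed through `...`
s_jaro <- seq_sim(a, b, method = "jw")
s_jw   <- seq_sim(a, b, method = "jw", p = 0.1)

print(c(q1 = s_q1, q2 = s_q2, jaro = s_jaro, jw = s_jw))
```

Because the two sequences share a common prefix, the score with p = 0.1 should be no lower than the plain Jaro score.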

Value

A numeric vector of length max(length(a),length(b)). If one of the entries in a or b is NA_integer_, all comparisons with that element result in NA. Missing values occurring within a sequence are treated as an ordinary number (the representation of NA_integer_).
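
The properties described above can be checked with a short sketch. The inputs are made up for illustration, and the comments state what the description implies rather than recorded output.

```r
library(stringdist)

# a has three entries, b has one, so b is recycled over a
a <- list(c(4L, NA_integer_, 6L), NA_integer_, 4:6)
b <- list(4:6)

s <- seq_sim(a, b)

# One score per comparison: max(length(a), length(b)) values
length(s) == max(length(a), length(b))

# Only the entry that is NA_integer_ as a whole should give NA;
# the NA inside the first sequence is compared like any other integer
is.na(s)

# Identical sequences have distance zero, hence similarity 1
s[3] == 1
```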

See Also

seq_dist, seq_amatch

Examples

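# compare each sequence in L1 with the single sequence in L2
L1 <- list(1:3,2:4)
L2 <- list(1:3)
seq_sim(L1,L2,method="osa")

# note how missing values are handled (L2 is recycled over L1):
# an NA inside a sequence is compared like an ordinary integer,
# while an entry that is NA_integer_ as a whole yields NA
L1 <- list(c(1L,NA_integer_,3L),2:4,NA_integer_)
L2 <- list(1:3)
seq_sim(L1,L2)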


stringdist documentation built on May 29, 2024, 11:13 a.m.