seq_sim: Compute similarity scores between sequences of integers


View source: R/stringsim.R

Description

Compute similarity scores between sequences of integers

Usage

seq_sim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw"),
  q = 1,
  ...
)

Arguments

a

list of integer vectors (target)

b

list of integer vectors (source). Optional for seq_distmatrix.

method

Distance method on which the similarity score is based. The default is "osa"; see stringdist-metrics.

q

Size of the q-gram; must be nonnegative. Only applies to method='qgram', 'jaccard' or 'cosine'.
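As a rough illustration of the q argument, the sketch below compares two integer sequences by their 2-grams. The variable names (x, y, s_jaccard, s_cosine, s_same) are made up for this example.

```r
library(stringdist)

# Hypothetical example data: two overlapping integer sequences
x <- list(1:4)
y <- list(2:5)

# q is only used by the q-gram based methods
s_jaccard <- seq_sim(x, y, method = "jaccard", q = 2)
s_cosine  <- seq_sim(x, y, method = "cosine",  q = 2)

# Comparing a sequence with itself gives similarity 1
s_same <- seq_sim(x, x, method = "jaccard", q = 2)
```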

...

Additional arguments, passed on to seq_dist.
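The link to seq_dist can be checked by hand. The sketch below assumes that, for the edit-based method "lv", the similarity is one minus the distance divided by the length of the longer sequence. That normalization is an assumption based on how the package describes stringsim, not something stated on this page. The names a, b, d, sim and manual are illustrative.

```r
library(stringdist)

a <- list(1:3, 2:4, c(1L, 5L))
b <- list(1:3)

d   <- seq_dist(a, b, method = "lv")  # Levenshtein distances
sim <- seq_sim(a, b, method = "lv")   # corresponding similarities

# Assumed normalization: divide by the length of the longer sequence
manual <- 1 - d / pmax(lengths(a), lengths(b))

all.equal(sim, manual)
```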

Value

A numeric vector of length max(length(a),length(b)); the shorter argument is recycled over the longer one. If one of the entries in a or b is NA_integer_, all comparisons with that element result in NA. Missing values occurring within a sequence are treated as an ordinary number (the representation of NA_integer_).
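A small sketch of the two kinds of missing value described above, using made-up inputs a and b: an NA inside a sequence still yields a score, while an element that is itself NA_integer_ yields NA.

```r
library(stringdist)

a <- list(
  c(1L, NA_integer_, 3L),  # NA inside a sequence: compared like any other number
  NA_integer_,             # the whole entry is missing: result is NA
  1:3
)
b <- list(1:3)             # recycled over the three entries of a

s <- seq_sim(a, b)

length(s)  # one score per entry of the longer list
is.na(s)   # marks the comparison involving the missing entry
```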

See Also

seq_dist, seq_amatch

Examples

L1 <- list(1:3,2:4)
L2 <- list(1:3)
seq_sim(L1,L2,method="osa")

# note how missing values are handled (L2 is recycled over L1)
L1 <- list(c(1L,NA_integer_,3L),2:4,NA_integer_)
L2 <- list(1:3)
seq_sim(L1,L2)

stringdist documentation built on Sept. 9, 2021, 5:08 p.m.