README.md
In vspinu/simdist: High performance distance and similarity functions

High-performance distances and similarities for various dense and sparse representations, with a primary focus on applications in NLP and recommender systems.

Supported representations:

- `matrix` from base R
- `dgCMatrix`, `dgRMatrix` and `dgTMatrix` from the Matrix package
- `simple_triplet_matrix` from the slam package
- data.frames in primary-secondary-value (psv) format
- lists of named numeric or character vectors
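To make the representations listed above concrete, here is a minimal sketch that builds the same toy data in several of them using base R (and, if installed, the Matrix package). It does not call simdist itself. The psv column names `primary`, `secondary` and `value` are hypothetical, chosen only to mirror the format's name; the package may expect a different layout.

```r
# Toy data: two documents (rows) over three terms (columns).
m <- matrix(c(1, 0, 2,
              0, 3, 0),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("doc1", "doc2"), c("a", "b", "c")))

# psv-style data.frame: one row per non-zero entry.
# Column names are hypothetical, used here for illustration only.
nz <- which(m != 0, arr.ind = TRUE)
psv <- data.frame(primary   = rownames(m)[nz[, "row"]],
                  secondary = colnames(m)[nz[, "col"]],
                  value     = m[nz],
                  stringsAsFactors = FALSE)

# List of named numeric vectors: one element per document,
# names are the terms, values are the weights.
lst <- lapply(split(psv, psv$primary),
              function(d) setNames(d$value, d$secondary))

# Column-compressed sparse matrix, only if Matrix is available.
if (requireNamespace("Matrix", quietly = TRUE)) {
  sp_c <- Matrix::Matrix(m, sparse = TRUE)
}
```

All three objects (`m`, `psv`, `lst`) carry the same three non-zero entries; only the container differs.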

| | matrix | dgCMatrix | dgRMatrix | dgTMatrix | slam | psv | list |
| ---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| cosine | ✔ | ✔ | ✔ | ✔ | | ✔ | |
| euclidean | ✔ | ✔ | ✔ | ✔ | | ✔ | |
| mahalanobis | | | | | | | |
| jaccard | | | | | | | |
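As a reference point for what the cosine and euclidean measures compute, the following base R sketch evaluates both between all pairs of rows of a small dense matrix. The helper names `cosine_sim` and `euclidean_dist` are made up for this illustration and are not simdist functions; the package's own implementations are presumably faster and also cover the sparse types.

```r
# Cosine similarity between all pairs of rows of a dense matrix.
cosine_sim <- function(x) {
  norms <- sqrt(rowSums(x^2))
  tcrossprod(x) / outer(norms, norms)
}

# Euclidean distance between all pairs of rows, using
# ||a - b||^2 = ||a||^2 + ||b||^2 - 2 <a, b>.
euclidean_dist <- function(x) {
  sq <- rowSums(x^2)
  d2 <- outer(sq, sq, "+") - 2 * tcrossprod(x)
  sqrt(pmax(d2, 0))  # clamp tiny negatives caused by rounding
}

x <- rbind(a = c(1, 0), b = c(0, 1), c = c(2, 0))
cosine_sim(x)
euclidean_dist(x)
```

Rows `a` and `c` point in the same direction, so their cosine similarity is 1 even though their Euclidean distance is not 0; this is the usual reason to prefer cosine when vector length should not matter.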

| | dgCMatrix | dgRMatrix | dgTMatrix | slam | psv | list |
| ---: | :---: | :---: | :---: | :---: | :---: | :---: |
| centroid | ✔ | ✔ | ✔ | | ✔ | |
| semantic_min_max [1] | ✔ | ✔ | ✔ | | ✔ | |
| semantic_min_sum [2] | ✔ | ✔ | ✔ | | ✔ | |

[1] More commonly known as "Relaxed Word Mover's Distance" (RWMD), proposed in Kusner et al., 'From Word Embeddings To Document Distances' (2015).

[2] Similar to the RWMD measure; proposed in Mihalcea et al., 'Corpus-Based and Knowledge-Based Measures of Text Semantic Similarity' (2006).
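For readers unfamiliar with RWMD, here is a minimal base R sketch of the relaxation described by Kusner et al.: each word moves all of its weight to the nearest word of the other document, and the larger of the two directional costs is taken. The function name `rwmd`, its arguments, and the toy embeddings are assumptions made for this illustration; this is not the package's implementation or API.

```r
# Relaxed Word Mover's Distance between two weighted bags of words.
#   d1, d2: named numeric vectors of term weights (names are terms)
#   emb:    numeric matrix of word embeddings with terms as rownames
rwmd <- function(d1, d2, emb) {
  w1 <- d1 / sum(d1)                      # normalise weights to sum to 1
  w2 <- d2 / sum(d2)
  e1 <- emb[names(d1), , drop = FALSE]
  e2 <- emb[names(d2), , drop = FALSE]
  # Pairwise Euclidean cost between words of d1 (rows) and d2 (columns).
  d2mat <- outer(rowSums(e1^2), rowSums(e2^2), "+") - 2 * tcrossprod(e1, e2)
  cost <- sqrt(pmax(d2mat, 0))
  # Each word sends its full weight to its nearest counterpart.
  l12 <- sum(w1 * apply(cost, 1, min))
  l21 <- sum(w2 * apply(cost, 2, min))
  max(l12, l21)                           # tighter of the two lower bounds
}

# Toy 2-dimensional embeddings (made-up values).
emb <- rbind(a = c(0, 0), b = c(1, 0), c = c(0, 2))
doc1 <- c(a = 1, b = 1)
doc2 <- c(c = 1)
rwmd(doc1, doc2, emb)
```

Both directional costs are lower bounds on the full Word Mover's Distance, which is why the maximum of the two is used.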

norm_l1, norm_l2.

vspinu/simdist documentation built on May 3, 2019, 7:09 p.m.
