High-performance distances and similarities for various dense and sparse representations, with a primary focus on applications in NLP and recommender systems.
- `matrix` from base R
- `dgCMatrix`, `dgRMatrix` and `dgTMatrix` from the Matrix package
- `simple_triplet_matrix` from the slam package
- `data.frame`s in primary-secondary-value (psv) format
- `list` of named numeric or character vectors

| | `matrix` | `dgCMatrix` | `dgRMatrix` | `dgTMatrix` | `slam` | `psv` | `list` |
| ---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| `cosine` | ✔ | ✔ | ✔ | ✔ | | ✔ | |
| `euclidean` | ✔ | ✔ | ✔ | ✔ | | ✔ | |
| `mahalanobis` | | | | | | | |
| `jaccard` | | | | | | | |
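
To make the `cosine` and `euclidean` rows concrete for sparse input, here is a minimal, language-independent sketch in plain Python. It is not this package's API. It assumes that a psv table is a set of (primary key, secondary key, value) triplets, which is an interpretation of the name rather than something the README spells out. The helper names `rows_from_triplets`, `cosine_similarity` and `euclidean_distance` are hypothetical.

```python
import math
from collections import defaultdict


def rows_from_triplets(triplets):
    """Group (primary, secondary, value) triplets into one sparse vector per primary key.

    Assumption: this is what a psv table encodes. Duplicate keys are summed.
    """
    rows = defaultdict(dict)
    for primary, secondary, value in triplets:
        rows[primary][secondary] = rows[primary].get(secondary, 0.0) + value
    return dict(rows)


def cosine_similarity(u, v):
    """Cosine similarity of two sparse vectors stored as {key: value} dicts."""
    dot = sum(x * v[k] for k, x in u.items() if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: an empty vector matches nothing
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Euclidean distance of two sparse vectors; missing keys count as zero."""
    keys = set(u) | set(v)
    return math.sqrt(sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in keys))


# Toy data: three users rating items "a", "b", "c".
triplets = [
    ("u1", "a", 3.0), ("u1", "b", 4.0),
    ("u2", "a", 3.0), ("u2", "b", 4.0),
    ("u3", "c", 5.0),
]
rows = rows_from_triplets(triplets)
sim_same = cosine_similarity(rows["u1"], rows["u2"])      # identical vectors
sim_disjoint = cosine_similarity(rows["u1"], rows["u3"])  # no shared keys
dist_disjoint = euclidean_distance(rows["u1"], rows["u3"])
```

Only keys present in both vectors contribute to the dot product, which is why sparse representations such as the ones in the table can be much cheaper to compare than dense matrices.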
| | `dgCMatrix` | `dgRMatrix` | `dgTMatrix` | `slam` | `psv` | `list` |
| ---: | :---: | :---: | :---: | :---: | :---: | :---: |
| `centroid` | ✔ | ✔ | ✔ | | ✔ | |
| `semantic_min_max` [1] | ✔ | ✔ | ✔ | | ✔ | |
| `semantic_min_sum` [2] | ✔ | ✔ | ✔ | | ✔ | |

[1] More commonly known as "Relaxed Word Mover's Distance" (RWMD), proposed in Kusner et al., 'From Word Embeddings To Document Distances' (2015).

[2] Similar to the RWMD measure, proposed in Mihalcea et al., 'Corpus-Based and Knowledge-Based Measures of Text Semantic Similarity' (2006).
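
The RWMD idea referenced in footnote [1] can be sketched as follows. This is a minimal illustration of the relaxed bound described by Kusner et al., not this package's implementation. It assumes documents are normalized bag-of-words weights and that word-to-word cost is the Euclidean distance between embeddings. The names `one_sided_cost`, `rwmd` and `normalize` and the toy 2-D embeddings are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def normalize(counts):
    """Turn raw word counts into weights that sum to 1."""
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def one_sided_cost(doc_a, doc_b, embeddings):
    """Each word in doc_a sends all of its weight to its nearest word in doc_b."""
    return sum(
        weight * min(euclidean(embeddings[w], embeddings[x]) for x in doc_b)
        for w, weight in doc_a.items()
    )


def rwmd(doc_a, doc_b, embeddings):
    """Relaxed Word Mover's Distance: the larger of the two one-sided bounds."""
    return max(
        one_sided_cost(doc_a, doc_b, embeddings),
        one_sided_cost(doc_b, doc_a, embeddings),
    )


# Toy embeddings, chosen so the distances are easy to check by hand.
embeddings = {"cat": (0.0, 0.0), "dog": (1.0, 0.0), "car": (0.0, 3.0)}
doc_a = normalize({"cat": 1, "dog": 1})
doc_b = normalize({"dog": 1, "car": 1})

distance_ab = rwmd(doc_a, doc_b, embeddings)
distance_aa = rwmd(doc_a, doc_a, embeddings)
```

In this toy case the bound from `doc_a` to `doc_b` is 0.5 and the reverse bound is 1.5, so the function returns the larger value. Taking a minimum over words and a maximum over directions is presumably where the name `semantic_min_max` comes from.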
`norm_l1`, `norm_l2`.