
The similarity between word vectors is defined

for type 'dot': as the square root of the average inner product of the vector elements, sqrt(sum(x . y) / ncol(x)), where a negative average is capped at zero before the square root is taken

for type 'cosine': as the cosine similarity, namely sum(x . y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
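The two definitions can be written out in a few lines of base R. This is a minimal sketch of the formulas as described here, not the package's own code; the object names (`inner`, `sim_dot`, `sim_cosine`) are illustrative, and it assumes the zero cap is applied before the square root.

```r
# Base-R sketch of the two similarity definitions (illustrative only).
set.seed(42)
x <- matrix(rnorm(6), nrow = 2, dimnames = list(c("word1", "word2"), NULL))
y <- matrix(rnorm(15), nrow = 5, dimnames = list(paste0("term", 1:5), NULL))

# All pairwise inner products: cell [i, j] equals sum(x[i, ] * y[j, ])
inner <- tcrossprod(x, y)

# type = "dot": average inner product, negatives capped at zero, then square root
sim_dot <- sqrt(pmax(inner / ncol(x), 0))

# type = "cosine": inner product divided by the product of the Euclidean norms
x_norm <- sqrt(rowSums(x^2))
y_norm <- sqrt(rowSums(y^2))
sim_cosine <- inner / outer(x_norm, y_norm)
```

Under these definitions the 'dot' values are never negative, while the 'cosine' values lie between -1 and 1 and do not depend on the length of the vectors.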

```
word2vec_similarity(x, y, top_n = +Inf, type = c("dot", "cosine"))
```

`x`
a matrix with embeddings where the rownames of the matrix provide the label of the term

`y`
a matrix with embeddings where the rownames of the matrix provide the label of the term

`top_n`
integer indicating to return only the top n most similar terms from y for each row of x.
If

`type`
character string with the type of similarity. Either 'dot' or 'cosine'. Defaults to 'dot'.

By default, the function returns a similarity matrix between the rows of `x` and the rows of `y`.
The similarity between row i of `x` and row j of `y` is found in cell `[i, j]` of the returned similarity matrix.

If `top_n` is provided, the return value is a data.frame with columns term1, term2, similarity and rank indicating the similarity between the provided terms in `x` and `y`, ordered from high to low similarity and keeping only the top_n most similar records.
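To make the shape of the `top_n` result concrete, the sketch below converts a small similarity matrix into the long format with columns term1, term2, similarity and rank. `top_similar` is a hypothetical helper written for illustration, not a function of the package, and it assumes that ranking is done separately for each row of `x`; the exact row ordering of the package's own output may differ.

```r
# Hypothetical helper (not part of the package): reshape a similarity matrix
# into long format and keep the top_n most similar terms per row term.
top_similar <- function(sim, top_n = 1) {
  long <- data.frame(
    term1 = rep(rownames(sim), times = ncol(sim)),
    term2 = rep(colnames(sim), each = nrow(sim)),
    similarity = as.vector(sim),
    stringsAsFactors = FALSE
  )
  # Sort by term1, then from high to low similarity
  long <- long[order(long$term1, -long$similarity), ]
  # Rank within each term1 group (1 = most similar)
  long$rank <- as.integer(ave(long$similarity, long$term1,
                              FUN = function(s) seq_along(s)))
  long <- long[long$rank <= top_n, ]
  rownames(long) <- NULL
  long
}

# Small hand-made similarity matrix: rows are terms of x, columns terms of y
sim <- matrix(c(0.9, 0.1, 0.4, 0.8, 0.2, 0.7), nrow = 2,
              dimnames = list(c("word1", "word2"),
                              c("term1", "term2", "term3")))
res <- top_similar(sim, top_n = 2)
res
```

With `top_n = 2`, each of the two row terms keeps its two best matches, so the result has four rows.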

```
library(word2vec)

## Example with random embeddings: 2 terms in x, 5 terms in y, 3 dimensions
x <- matrix(rnorm(6), nrow = 2, ncol = 3)
rownames(x) <- c("word1", "word2")
y <- matrix(rnorm(15), nrow = 5, ncol = 3)
rownames(y) <- c("term1", "term2", "term3", "term4", "term5")

## Default type = "dot": full similarity matrix, then top_n variants
word2vec_similarity(x, y)
word2vec_similarity(x, y, top_n = 1)
word2vec_similarity(x, y, top_n = 2)
word2vec_similarity(x, y, top_n = +Inf)

## Same calls using cosine similarity
word2vec_similarity(x, y, type = "cosine")
word2vec_similarity(x, y, top_n = 1, type = "cosine")
word2vec_similarity(x, y, top_n = 2, type = "cosine")
word2vec_similarity(x, y, top_n = +Inf, type = "cosine")

## Example with a word2vec model
path <- system.file(package = "word2vec", "models", "example.bin")
model <- read.word2vec(path)
emb <- as.matrix(model)
x <- emb[c("gastheer", "gastvrouw", "kamer"), ]
y <- emb
word2vec_similarity(x, x)
word2vec_similarity(x, y, top_n = 3)

## For comparison: nearest terms as reported by predict()
predict(model, x, type = "nearest", top_n = 3)
```
