cs: Cosine Similarity
In scottmanski/TAGAM:



This function computes the pairwise cosine similarity between the word embeddings of two vectors of words.

cs(a, b, word_embeddings)

`a, b`	character strings or character vectors of words, each of which must appear in `word_embeddings`.
`word_embeddings`	named list of word embeddings. See `formatWordEmbeddings`.

Consider two words with word embedding representations $a$ and $b$. Their cosine similarity is defined as

$$\mathrm{sim}_{\cos}(a, b) = \frac{a \cdot b}{\| a \|_2 \, \| b \|_2}$$

If $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_m)$, then the result is an $m \times n$ matrix whose entry in cell $(i, j)$ is $\mathrm{sim}_{\cos}(a_j, b_i)$.
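The definition above can be illustrated with a minimal base R sketch. It is not the package's implementation: `toy_embeddings`, `cosine_sim`, and `cs_sketch` are hypothetical names, and the three-dimensional vectors are made up for illustration. It assumes only that the embeddings are a named list of numeric vectors, as described for `word_embeddings`.

```r
# Hypothetical toy embeddings (made-up values); real ones would come from
# formatWordEmbeddings().
toy_embeddings <- list(
  home  = c(1, 0, 1),
  house = c(1, 1, 1),
  dog   = c(0, 2, 0)
)

# Cosine similarity between two numeric vectors: dot product over the
# product of their Euclidean norms.
cosine_sim <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Pairwise similarities: rows index the words in b, columns the words in a,
# so cell (i, j) holds the similarity between a[j] and b[i].
cs_sketch <- function(a, b, embeddings) {
  sims <- sapply(a, function(wa) {
    vapply(b, function(wb) cosine_sim(embeddings[[wa]], embeddings[[wb]]),
           numeric(1))
  })
  matrix(sims, nrow = length(b), ncol = length(a), dimnames = list(b, a))
}

result <- cs_sketch(c("home", "house"), c("home", "dog", "house"), toy_embeddings)
print(round(result, 3))
```

Here `a` has two words and `b` has three, so `result` is a 3 by 2 matrix, matching the m by n layout described above.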

A matrix of cosine similarities, with one row per word in `b` and one column per word in `a`.

Goldberg, Y. (2017) Neural Network Methods for Natural Language Processing. San Rafael, CA: Morgan & Claypool Publishers.

formatWordEmbeddings

## Not run: 

word_embeddings <- formatWordEmbeddings(embedding_matrix_example, normalize = TRUE)

a <- "home"
b <- "house"
cs(a, b, word_embeddings)

a <- c("home", "apartment", "mansion")
b <- c("my", "dog", "sleeps", "in", "her", "dog", "house")
cs(a, b, word_embeddings)

## End(Not run)