
Relaxed word mover's distance (RWMD) tries to measure the distance between documents by calculating how hard it is to transform the words of the first document into the words of the second document, and vice versa. For more detail see the original article: http://mkusner.github.io/publications/WMD.pdf.
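The idea can be sketched in a few lines of base R. This is only a minimal illustration of the relaxed lower bound described in the article linked above, not the package's implementation: the helper name `rwmd_sketch`, the toy embeddings in `wv_toy`, and the choice of Euclidean distance between word vectors are all assumptions made for the example.

```r
# Toy word embeddings (made up): rows are words, columns are vector components.
wv_toy = rbind(
  cat = c(1, 0),
  dog = c(0.9, 0.1),
  car = c(0, 1)
)

# Hypothetical helper: relaxed word mover's distance between two documents,
# each given as a named vector of word counts.
rwmd_sketch = function(doc1, doc2, wv) {
  # normalize counts so that the word weights of each document sum to 1
  w1 = doc1 / sum(doc1)
  w2 = doc2 / sum(doc2)
  # pairwise Euclidean distances between the words of the two documents
  v1 = wv[names(doc1), , drop = FALSE]
  v2 = wv[names(doc2), , drop = FALSE]
  n1 = nrow(v1)
  d = as.matrix(dist(rbind(v1, v2)))[seq_len(n1), n1 + seq_len(nrow(v2)), drop = FALSE]
  # relaxation: every word moves entirely to its nearest word in the other document
  cost_1_to_2 = sum(w1 * apply(d, 1, min))
  cost_2_to_1 = sum(w2 * apply(d, 2, min))
  # keep the tighter of the two lower bounds
  max(cost_1_to_2, cost_2_to_1)
}

d_same = rwmd_sketch(c(cat = 1, dog = 1), c(cat = 1, dog = 1), wv_toy)
d_near = rwmd_sketch(c(cat = 2), c(dog = 1), wv_toy)
d_far  = rwmd_sketch(c(cat = 2), c(car = 1), wv_toy)
```

In this toy setup identical documents are at distance zero, and a document about `cat` is closer to one about `dog` than to one about `car`, because the made-up vectors for `cat` and `dog` are close together.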


`R6Class` object.

`progressbar`: `logical = TRUE`, whether to display a progress bar.

For usage details see **Methods, Arguments and Examples** sections.

```
rwmd = RelaxedWordMoversDistance$new(wv, method = c("cosine", "euclidean"), normalize = TRUE, progressbar = interactive())
rwmd$dist2(x, y)
rwmd$pdist2(x, y)
```

`$new(wv, method = c("cosine", "euclidean"))`

Constructor for the RWMD model. For a description of the arguments see the **Arguments** section.

`$dist2(x, y)`

Computes the distance between each row of sparse matrix `x` and each row of sparse matrix `y`.

`$pdist2(x, y)`

Computes the "parallel" distance between the rows of sparse matrix `x` and the corresponding rows of sparse matrix `y`.
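The two methods differ in the shape of their result: all pairs of rows versus matching rows only. The sketch below illustrates that difference in base R using plain Euclidean distance on small dense matrices. It says nothing about how the package computes RWMD, and the helper names `all_pairs_dist` and `parallel_dist` are hypothetical.

```r
x = rbind(c(0, 0), c(3, 4))
y = rbind(c(0, 0), c(0, 4))

# all pairs: entry [i, j] is the distance between x[i, ] and y[j, ]
all_pairs_dist = function(x, y) {
  n = nrow(x)
  as.matrix(dist(rbind(x, y)))[seq_len(n), n + seq_len(nrow(y)), drop = FALSE]
}

# "parallel": element i is the distance between x[i, ] and y[i, ]
parallel_dist = function(x, y) {
  stopifnot(nrow(x) == nrow(y))
  sqrt(rowSums((x - y)^2))
}

d_all = all_pairs_dist(x, y)  # 2 x 2 matrix
d_par = parallel_dist(x, y)   # vector of length 2
```

The parallel result matches the diagonal of the all-pairs matrix, so it is the cheaper choice when only row `i` of `x` versus row `i` of `y` is of interest.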

- rwmd
`RWMD` object.

- x
sparse document-term matrix.

- y
`y = NULL`. Sparse document-term matrix. If `y = NULL` (the default), `y = x` is assumed.

- wv
word vectors. Numeric matrix which contains word embeddings. Rows correspond to words, columns to the components of their vectors. Rows should have word names.

- method
name of the distance used to measure similarity between two word vectors. The authors of the original paper use `"euclidean"`; however, we use `"cosine"` by default (better in our experience). This means `distance = 1 - cosine_angle_between_wv`.
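As a small illustration of the default, the cosine distance between two word vectors can be written in base R as below. The helper name `cosine_distance` and the example vectors are made up for this sketch and are not part of the package.

```r
# Hypothetical helper: 1 minus the cosine of the angle between vectors a and b.
cosine_distance = function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

d_parallel   = cosine_distance(c(1, 2), c(2, 4))  # same direction, close to 0
d_orthogonal = cosine_distance(c(1, 0), c(0, 1))  # right angle, equal to 1
```

Unlike Euclidean distance, this measure ignores vector length: `c(1, 2)` and `c(2, 4)` point in the same direction and so are at distance zero up to floating-point error.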

```
## Not run:
library(text2vec)
data("movie_review")
tokens = word_tokenizer(tolower(movie_review$review))
v = create_vocabulary(itoken(tokens))
v = prune_vocabulary(v, term_count_min = 5, doc_proportion_max = 0.5)
it = itoken(tokens)
vectorizer = vocab_vectorizer(v)
dtm = create_dtm(it, vectorizer)
tcm = create_tcm(it, vectorizer, skip_grams_window = 5)
glove_model = GloVe$new(word_vectors_size = 50, vocabulary = v, x_max = 10)
wv = glove_model$fit_transform(tcm, n_iter = 10)
# sum main and context vectors as proposed in the GloVe paper
wv = wv + t(glove_model$components)
rwmd_model = RWMD$new(wv)
rwmd_dist = dist2(dtm[1:100, ], dtm[1:10, ], method = rwmd_model, norm = 'none')
head(rwmd_dist)
## End(Not run)
```
