| most_similar | R Documentation | 
Find the top-N most similar words, replicating the results of the most_similar() function in the Python gensim module. (Exact replication of gensim requires the same word vectors data, not the demodata used in the examples here.)
most_similar(
  data,
  x = NULL,
  topn = 10,
  above = NULL,
  keep = FALSE,
  row.id = TRUE,
  verbose = TRUE
)
data | 
 A   | 
x | 
 Can be: 
  | 
topn | 
 Number of most similar words to return (top-N). Defaults to 10. | 
above | 
 Defaults to NULL.
 If both   | 
keep | 
 Keep words specified in x in the results? Defaults to FALSE. | 
row.id | 
 Return the row number of each word? Defaults to TRUE. | 
verbose | 
 Print information to the console? Defaults to TRUE. | 
A data.table with the most similar words and their cosine similarities.
Download pre-trained word vectors data (.RData): https://psychbruce.github.io/WordVector_RData.pdf
sum_wordvec()
dict_expand()
dict_reliability()
cosine_similarity()
pair_similarity()
plot_similarity()
tab_similarity()
d = as_embed(demodata, normalize=TRUE)
most_similar(d)
most_similar(d, "China")
most_similar(d, c("king", "queen"))
most_similar(d, cc(" king , queen ; man | woman "))
# the same as above:
most_similar(d, ~ China)
most_similar(d, ~ king + queen)
most_similar(d, ~ king + queen + man + woman)
most_similar(d, ~ boy - he + she)
most_similar(d, ~ Jack - he + she)
most_similar(d, ~ Rose - she + he)
most_similar(d, ~ king - man + woman)
most_similar(d, ~ Tokyo - Japan + China)
most_similar(d, ~ Beijing - China + Japan)
most_similar(d, "China", above=0.7)
most_similar(d, "China", above="Shanghai")
# automatically normalized for more accurate results
ms = most_similar(demodata, ~ king - man + woman)
ms
str(ms)
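
The formula examples above (for instance ~ king - man + woman) can be read as vector arithmetic followed by a cosine-similarity ranking. The base-R sketch below illustrates that idea on a tiny made-up embedding matrix. It is not the package's implementation: the helper names l2_normalize() and top_similar(), the toy vectors, and the choice to drop the query words from the ranking are all assumptions made for illustration.

```r
# Toy embedding matrix: one row per word (values are made up for illustration).
emb <- rbind(
  king  = c(0.9, 0.8, 0.1),
  queen = c(0.9, 0.1, 0.8),
  man   = c(0.1, 0.9, 0.1),
  woman = c(0.1, 0.1, 0.9),
  apple = c(-0.5, 0.2, 0.2)
)

# L2-normalize each row so that a dot product equals cosine similarity.
l2_normalize <- function(m) m / sqrt(rowSums(m^2))

# Hypothetical helper (not part of the package): rank words by cosine
# similarity to the sum of "positive" word vectors minus "negative" ones.
top_similar <- function(emb, positive, negative = character(0), topn = 3) {
  emb <- l2_normalize(emb)
  # Combine the unit vectors of the query words into one target vector.
  target <- colSums(emb[positive, , drop = FALSE]) -
    colSums(emb[negative, , drop = FALSE])
  target <- target / sqrt(sum(target^2))
  # Cosine similarity of every word with the target.
  sims <- drop(emb %*% target)
  # Assumption: exclude the query words themselves from the ranking.
  sims <- sims[setdiff(names(sims), c(positive, negative))]
  head(sort(sims, decreasing = TRUE), topn)
}

# Analogue of: most_similar(d, ~ king - man + woman)
res <- top_similar(emb, positive = c("king", "woman"), negative = "man")
print(res)
```

With these toy vectors, "queen" ranks first. Real word vectors have hundreds of dimensions, so results from most_similar() on actual data will differ from this sketch.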