knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

You must run setup_word2vec at the beginning of every session; otherwise you will encounter errors and be prompted to do so.

library(word2vec.r)

# setup word2vec Julia dependency
setup_word2vec()

The package comes with a dataset, Macbeth by Shakespeare. However, because it is a corpus of 17,319 words, it is not lazily loaded and must be imported manually with the data function. Note that the dataset is mildly preprocessed: all words are lowercase and punctuation has been removed.

data("macbeth", package = "word2vec.r")
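
If you want to bring your own text into a similar shape, a comparable preprocessing step can be sketched in base R. This is only a guess at what "mildly preprocessed" amounts to (lowercasing and stripping punctuation). The preprocess_text helper is hypothetical and is not part of word2vec.r.

```r
# Hypothetical helper, not part of word2vec.r:
# lowercase the text and strip punctuation, as described for the macbeth dataset.
preprocess_text <- function(x) {
  x <- tolower(x)                    # convert all words to lowercase
  x <- gsub("[[:punct:]]+", "", x)   # drop punctuation characters
  x <- gsub("[[:space:]]+", " ", x)  # collapse runs of whitespace left behind
  trimws(x)                          # remove leading and trailing spaces
}

preprocess_text("Double, double toil and trouble;  Fire burn!")
#> [1] "double double toil and trouble fire burn"
```

The whitespace step is an addition of this sketch. It only tidies the gaps that removing punctuation can leave behind.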

Functions

You can also cluster words.

model_path <- word2clusters(macbeth, classes = 50L) # train model
model <- word_clusters(model_path)

We provide both a functional API and a reference class.

Functional

get_cluster(model, "king")
get_cluster(model, "macbeth")

Reference Class

We provide a reference class because it is tedious to pass the vectors (the model object in this example) as the first argument to every function.

wc <- WordClusters$new(model)
wc$get_words(4L)
wc$clusters()
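
To illustrate the design idea, here is a minimal sketch of how a reference class can hold a model once so that its methods need no model argument. It uses base R's setRefClass with a made-up data frame standing in for a trained model. The ClusterHolder class, its fields, and the toy data are all assumptions for illustration and do not show how WordClusters is actually implemented.

```r
library(methods)

# Illustrative only: a toy reference class that stores the model once.
# This is NOT the WordClusters implementation from word2vec.r.
ClusterHolder <- setRefClass(
  "ClusterHolder",
  fields = list(model = "data.frame"),
  methods = list(
    # words assigned to a given cluster id
    get_words = function(cluster) {
      model$word[model$cluster == cluster]
    },
    # cluster id assigned to a given word
    get_cluster = function(word) {
      model$cluster[model$word == word]
    }
  )
)

# Made-up word-to-cluster assignments standing in for a trained model.
toy <- data.frame(
  word = c("king", "queen", "dagger"),
  cluster = c(1L, 1L, 2L),
  stringsAsFactors = FALSE
)

holder <- ClusterHolder$new(model = toy)
holder$get_words(1L)
#> [1] "king"  "queen"
holder$get_cluster("dagger")
#> [1] 2
```

Because the object carries the model, each call names only what varies (the cluster id or the word), which is the convenience the reference class offers over the functional API.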


news-r/word2vec.r documentation built on Nov. 4, 2019, 9:41 p.m.