read.wordvectors: Read word vectors from a word2vec model from disk

View source: R/word2vec.R

read.wordvectorsR Documentation

Read word vectors from a word2vec model from disk

Description

Read word vectors from a word2vec model from disk into a dense matrix

Usage

read.wordvectors(
  file,
  type = c("bin", "txt"),
  n = .Machine$integer.max,
  normalize = FALSE,
  encoding = "UTF-8"
)

Arguments

file

the path to the model file

type

either 'bin' or 'txt' indicating the file is a binary file or a text file

n

integer, indicating to limit the number of words to read in. Defaults to reading all words.

normalize

logical indicating to normalize the embeddings by dividing by the factor (sqrt(sum(x . x) / length(x))). Defaults to FALSE.

encoding

encoding to be assumed for the words. Defaults to 'UTF-8'

Value

A matrix with the embeddings of the words. The rownames of the matrix are the words which are by default set to UTF-8 encoding.

Examples

path  <- system.file(package = "word2vec", "models", "example.bin")
embed <- read.wordvectors(path, type = "bin", n = 10)
embed <- read.wordvectors(path, type = "bin", n = 10, normalize = TRUE)
embed <- read.wordvectors(path, type = "bin")

path  <- system.file(package = "word2vec", "models", "example.txt")
embed <- read.wordvectors(path, type = "txt", n = 10)
embed <- read.wordvectors(path, type = "txt", n = 10, normalize = TRUE)
embed <- read.wordvectors(path, type = "txt")

word2vec documentation built on Oct. 8, 2023, 1:07 a.m.