read_word2vec | R Documentation |
Read a word2vec embedding file as a dense matrix. This uses read.wordvectors
from the word2vec package.
read_word2vec( x, type = c("txt", "bin"), n = .Machine$integer.max, encoding = "UTF-8", normalize = TRUE )
x |
path to the file |
type |
either 'bin' or 'txt', indicating whether the file is in binary or text format |
n |
integer giving the maximum number of words to read. Defaults to reading all words. |
encoding |
encoding to be assumed for the words. Defaults to 'UTF-8' |
normalize |
logical indicating whether to normalize the embeddings by dividing by the factor sqrt(sum(x . x) / length(x)). Defaults to TRUE. |
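The normalization factor described for normalize can be illustrated on a small matrix. This is a minimal sketch in base R, not the package's internal code: the toy values and the names emb, rms and emb_norm are made up, and applying the factor separately to each token's row is an assumption.

```r
# Toy embedding matrix: 2 tokens, 4 dimensions (values invented for illustration).
emb <- matrix(c(1,    2,   3,    4,
                0.5, -0.5, 0.5, -0.5),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("tok_a", "tok_b"), NULL))

# Factor from the argument description, sqrt(sum(x . x) / length(x)),
# computed once per row (assumption: x is the embedding of one token).
rms <- sqrt(rowSums(emb * emb) / ncol(emb))

# Divide every row by its own factor. A vector of length nrow(emb)
# recycles down the columns, so element i always scales row i.
emb_norm <- emb / rms

# After scaling, the root mean square of each row is 1.
sqrt(rowSums(emb_norm^2) / ncol(emb_norm))
```

Under this reading, normalization rescales each embedding but keeps its direction, so cosine similarities between tokens are unchanged.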
a matrix with one row per token containing the embedding of the token
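Because the result has one row per token, a single embedding can be selected by row and compared with the others. The sketch below uses a hand-made matrix in place of a real read_word2vec() result. It assumes that the tokens are stored as row names, and the helper cosine() is hypothetical, not part of the package.

```r
# Hypothetical stand-in for the returned matrix: one row per token.
embedding <- matrix(c(1,   0,   0,
                      0.8, 0.6, 0,
                      0,   0,   1),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("de", "het", "en"), NULL))

# Hypothetical helper: cosine similarity between two numeric vectors.
cosine <- function(a, b) {
  sum(a * b) / (sqrt(sum(a * a)) * sqrt(sum(b * b)))
}

# Look up one token by row name (assumes row names hold the tokens).
vec <- embedding["de", ]

# Compare that token with every row of the matrix.
sims <- apply(embedding, 1, cosine, b = vec)

# Most similar tokens first.
sort(sims, decreasing = TRUE)
```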
read.wordvectors
library(sentencepiece)

# Folder with the example models shipped in the sentencepiece package
folder <- system.file(package = "sentencepiece", "models")

# Read the binary word2vec file
embedding <- file.path(folder, "nl.wiki.bpe.vs1000.d25.w2v.bin")
embedding <- read_word2vec(embedding, type = "bin")
head(embedding)

# Read the text word2vec file
embedding <- file.path(folder, "nl.wiki.bpe.vs1000.d25.w2v.txt")
embedding <- read_word2vec(embedding, type = "txt")
head(embedding, n = 10)