wordEmbed: Embed Words to Vectors Using Pre-trained Word2vec Dictionary


Description

Embeds the words of each text string as vectors using a pre-trained word2vec dictionary. The default word2vec data frame can be replaced with a custom dictionary.

Usage

wordEmbed(object, dictionary, meanVec)

Arguments

object

Character vector of text, with each element representing one document.

dictionary

Data frame holding a pre-trained word2vec dataset. The first column contains the words and the remaining columns contain the numeric vector components from a word2vec model. The default dataset shipped with the package is a pre-trained 20-dimensional word2vec dataset.
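To illustrate the layout described above, here is a minimal sketch of a custom dictionary built in base R. The name customDict, the words, and the values are made up for illustration, and the three dimensions are arbitrary; only the column arrangement (word first, numeric components after) comes from the description.

```r
# Hypothetical 3-dimensional dictionary; the words and values are made up.
customDict <- data.frame(
  word = c("this", "is", "an", "example"),
  V1 = c(0.1, 0.4, -0.2, 0.7),
  V2 = c(0.3, -0.1, 0.5, 0.2),
  V3 = c(-0.6, 0.2, 0.1, 0.9),
  stringsAsFactors = FALSE
)

# Check the layout: first column is character, all remaining columns are numeric.
stopifnot(is.character(customDict[[1]]))
stopifnot(all(vapply(customDict[-1], is.numeric, logical(1))))
```

A data frame shaped like this is what the dictionary argument expects in place of the default word2vec dataset.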

meanVec

Logical value. If meanVec is TRUE, a matrix is returned in which each row is the mean of the numeric vectors of all the words in a document. If FALSE, a list of matrices is returned in which each document is represented by its own matrix.

Value

wordEmbed returns a matrix if meanVec is TRUE and a list of matrices if meanVec is FALSE.
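The difference between the two return shapes can be sketched in base R without the package. The helper embedSketch and the toy dictionary toyDict below are hypothetical and are not the softmaxreg implementation; lowercasing, whitespace tokenization, and silently dropping words missing from the dictionary are all assumptions made for this sketch.

```r
# Toy dictionary (made-up values): first column is the word,
# the remaining columns are the vector components.
toyDict <- data.frame(
  word = c("this", "is", "an", "example"),
  V1 = c(1, 3, 5, 7),
  V2 = c(2, 4, 6, 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of softmaxreg.
embedSketch <- function(docs, dictionary, meanVec = TRUE) {
  # Numeric part of the dictionary as a matrix, indexed by word.
  vectors <- as.matrix(dictionary[, -1, drop = FALSE])
  rownames(vectors) <- dictionary[[1]]

  perDoc <- lapply(docs, function(doc) {
    # Assumption: lowercase the text and split on whitespace.
    tokens <- strsplit(tolower(doc), "\\s+")[[1]]
    # Assumption: drop words that are not in the dictionary.
    tokens <- tokens[tokens %in% rownames(vectors)]
    # One row per remaining word.
    vectors[tokens, , drop = FALSE]
  })

  if (meanVec) {
    # One row per document: the column-wise mean of its word vectors.
    do.call(rbind, lapply(perDoc, colMeans))
  } else {
    # One matrix per document.
    perDoc
  }
}

docs <- c("This is an example", "an unknown example")
meanMat <- embedSketch(docs, toyDict, meanVec = TRUE)   # 2 x 2 matrix
matList <- embedSketch(docs, toyDict, meanVec = FALSE)  # list of 2 matrices
```

With the mean form every document has the same number of columns regardless of its length, which suits models that need fixed-size input; the list form keeps one row per word for each document.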

See Also

document, word2vec

Examples

data(word2vec) # load the default 20-dimensional word2vec dataset
doc <- "This is an example line of document"
docVectors <- wordEmbed(doc, word2vec, meanVec = TRUE)

softmaxreg documentation built on May 2, 2019, 6:08 a.m.