vectorize.words | R Documentation |
Vectorize words from a corpus of documents.
vectorize.words(
corpus = NULL,
ndim = 50,
maxwords = NULL,
mincount = 5,
minphrasecount = NULL,
window = 5,
maxcooc = 10,
maxiter = 10,
epsilon = 0.01,
lang = "en",
stopwords = lang,
...
)
corpus |
The corpus of documents (a character vector). |
ndim |
The number of dimensions of the vector space. |
maxwords |
The maximum number of words. |
mincount |
Minimum count for a word to be considered frequent. |
minphrasecount |
Minimum count for a collocation of words (a phrase) to be considered frequent. |
window |
Window size for term co-occurrence matrix construction. |
maxcooc |
Maximum number of co-occurrences to use in the weighting function. |
maxiter |
The maximum number of iterations to fit the GloVe model. |
epsilon |
Defines the early stopping strategy when fitting the GloVe model. |
lang |
The language of the documents (NULL if no stemming). |
stopwords |
Stopwords, or the language of the documents. NULL if stop words should not be removed. |
... |
Other parameters. |
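The `maxcooc` argument caps the co-occurrence count used by the weighting function. As a minimal sketch, assuming the standard GloVe weighting (a power of `x / maxcooc` below the cap, and 1 at or above it), the effect can be illustrated as follows. The helper `glove.weight` and the exponent `alpha = 0.75` are illustrative assumptions, not part of this function's documented interface.

```r
# Hypothetical helper illustrating the standard GloVe weighting function.
# x: co-occurrence counts; maxcooc: the cap; alpha: exponent (assumed 0.75).
glove.weight = function (x, maxcooc = 10, alpha = 0.75) {
  ifelse (x < maxcooc, (x / maxcooc) ^ alpha, 1)
}

# Rare pairs are down-weighted; pairs at or above the cap all get weight 1.
w = glove.weight (c (1, 5, 10, 100))
print (round (w, 3))
```

Under this assumption, raising `maxcooc` makes the fit distinguish more finely between moderately and very frequent word pairs, while lowering it treats more pairs as equally important.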
The vectorized words.
query.words, stopwords, vectorizers
## Not run:
text = loadtext ("http://mattmahoney.net/dc/text8.zip")
words = vectorize.words (text, minphrasecount = 50)
query.words (words, origin = "paris", sub = "france", add = "germany")
query.words (words, origin = "berlin", sub = "germany", add = "france")
query.words (words, origin = "new_zealand")
## End(Not run)
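The first two queries follow the usual word-analogy pattern: start from `origin`, subtract `sub`, add `add`, and look for the nearest word. The sketch below illustrates that arithmetic in base R on a hand-made toy embedding. The matrix `emb`, the helper `nearest.words`, and the use of cosine similarity are assumptions for illustration only; `query.words` itself may compute and rank results differently.

```r
# Toy 2-dimensional embedding (made-up values, for illustration only).
emb = rbind (paris   = c ( 1.0, 1.0),
             france  = c ( 1.0, 0.0),
             berlin  = c ( 0.0, 1.1),
             germany = c ( 0.0, 0.1),
             tokyo   = c (-1.0, 0.5))

# Hypothetical helper: rank words by cosine similarity to a target vector,
# leaving out the words that were used to build the query.
nearest.words = function (emb, target, exclude = character (0)) {
  norms = sqrt (rowSums (emb ^ 2)) * sqrt (sum (target ^ 2))
  sims = as.vector (emb %*% target) / norms
  names (sims) = rownames (emb)
  sims = sims [!(names (sims) %in% exclude)]
  sort (sims, decreasing = TRUE)
}

# paris - france + germany
target = emb ["paris", ] - emb ["france", ] + emb ["germany", ]
res = nearest.words (emb, target, exclude = c ("paris", "france", "germany"))
print (names (res) [1])
```

With real embeddings the nearest neighbour is not guaranteed to be the expected word; the quality of such analogies depends on the corpus and on parameters such as `ndim`, `window` and `mincount`.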