train_word2vec: Train a model by word2vec.
In bmschmidt/wordVectors: Tools for creating and analyzing vector-space models of texts

train_word2vec

R Documentation

Description

Train a model by word2vec.

Usage

train_word2vec(train_file, output_file = "vectors.bin", vectors = 100,
  threads = 1, window = 12, classes = 0, cbow = 0, min_count = 5,
  iter = 5, force = F, negative_samples = 5)

Arguments

`train_file`	Path of a single .txt file for training. Tokens are split on spaces.
`output_file`	Path of the output file.
`vectors`	The number of dimensions in each output word vector. Defaults to 100. More dimensions usually mean more precision, but also more random error, higher memory usage, and slower operations. Sensible choices are probably in the range 100-500.
`threads`	Number of threads to run the training process on. Defaults to 1; raising it, up to the number of (virtual) cores on your machine, may speed things up.
`window`	The size of the window (in words) to use in training.
`classes`	Number of classes for k-means clustering. Not documented/tested.
`cbow`	If 1, use a continuous-bag-of-words model instead of skip-grams. Defaults to 0 (skip-gram), which is recommended for newcomers.
`min_count`	Minimum times a word must appear to be included in the samples. High values help reduce model size.
`iter`	Number of passes to make over the corpus in training.
`force`	Whether to overwrite existing model files.
`negative_samples`	Number of negative samples to take in skip-gram training. 0 means full sampling, while lower numbers give faster training. For large corpora 2-5 may work; for smaller corpora, 5-15 is reasonable.
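
As a rough illustration of how these arguments fit together, the sketch below picks a thread count from the machine's core count and passes every tuning argument explicitly. The paths "corpus.txt" and "corpus_vectors.bin" are hypothetical, and the chosen values are only examples, not recommendations from the package authors. The call is guarded so that nothing is trained unless the package is installed and the corpus file exists.

```r
# Sketch only: "corpus.txt" and "corpus_vectors.bin" are hypothetical paths.
library(parallel)  # ships with base R

# detectCores() can return NA on some platforms, so fall back to 1.
cores <- detectCores()
n_threads <- if (is.na(cores)) 1L else max(1L, cores - 1L)

corpus <- "corpus.txt"

# Only train when the package and the corpus file are actually available.
if (requireNamespace("wordVectors", quietly = TRUE) && file.exists(corpus)) {
  model <- wordVectors::train_word2vec(
    train_file = corpus,
    output_file = "corpus_vectors.bin",
    vectors = 200,           # dimensions per word vector
    threads = n_threads,     # leave one core free when possible
    window = 12,
    cbow = 0,                # 0 = skip-gram
    min_count = 5,
    iter = 5,
    negative_samples = 5
  )
}
```

Leaving one core free is a design choice made here for interactive use; passing the full core count is equally valid.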

Details

The word2vec tool takes a text corpus as input and produces word vectors as output. It first constructs a vocabulary from the training text and then learns a vector representation for each word. The resulting word vector file can be used as a source of features in many natural language processing and machine learning applications.
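
The vocabulary step can be pictured with a small base R sketch. This is not the code word2vec itself runs; it only mimics the behaviour described here, assuming tokens are split on single spaces and that words seen fewer than `min_count` times are dropped. The toy text and the variable names are made up for illustration.

```r
# Toy corpus; the real tool reads its tokens from train_file.
text <- "the cat sat on the mat the cat ran"

# Split on single spaces, as train_file is tokenized.
tokens <- strsplit(text, " ", fixed = TRUE)[[1]]

# Count how often each distinct token occurs.
counts <- table(tokens)

# Keep only tokens that meet the frequency threshold.
min_count <- 2
vocab <- names(counts)[counts >= min_count]

print(counts)
print(vocab)
```

With this toy text only "the" and "cat" occur at least twice, so they are the only words that would receive vectors.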

Value

A VectorSpaceModel object.
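
A minimal sketch of inspecting the returned object follows. It assumes the VectorSpaceModel behaves like a numeric matrix with one row per word and the words as row names, and it uses the package's `closest_to()` function to list neighbours of a word taken from the model itself. The output path is a temporary file chosen for illustration, and the whole block is skipped when the package is not installed.

```r
# Check for the package first so the sketch is harmless without it.
have_pkg <- requireNamespace("wordVectors", quietly = TRUE)

if (have_pkg) {
  # Example corpus bundled with the package (same file as in Examples).
  corpus <- system.file("examples", "rfaq.txt", package = "wordVectors")

  # Write to a fresh temporary file to avoid clashing with an existing model.
  out <- tempfile(fileext = ".bin")
  model <- wordVectors::train_word2vec(corpus, output_file = out)

  # Assumed matrix-like layout: rows are words, columns are dimensions.
  print(dim(model))
  print(head(rownames(model)))

  # Query with a word that is guaranteed to be in the vocabulary.
  query <- rownames(model)[2]
  print(wordVectors::closest_to(model, query))
}
```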

Author(s)

Jian Li <rweibo@sina.com>, Ben Schmidt <bmchmidt@gmail.com>

References

https://code.google.com/p/word2vec/

Examples

## Not run: 
model = train_word2vec(system.file("examples", "rfaq.txt", package = "wordVectors"))

## End(Not run)

bmschmidt/wordVectors documentation built on June 2, 2022, 3:53 p.m.
