word2phrase | R Documentation |
Convert words to phrases in a text file.
word2phrase(train_file, output_file, debug_mode = 0, min_count = 5, threshold = 100, force = FALSE)
train_file |
Path of a single .txt file for training. Tokens are split on spaces. |
output_file |
Path of output file |
debug_mode |
Debug mode. Must be 0, 1, or 2. 0 is silent; 1 prints summary statistics; 2 prints progress regularly. |
min_count |
Minimum number of times a word must appear to be included in the samples. High values help reduce model size. |
threshold |
Threshold value for determining whether pairs of words are phrases. Higher values produce fewer phrases. |
force |
Whether to overwrite existing files at the output location. Default is FALSE. |
This function attempts to learn phrases from a text document. It does so by joining adjacent pairs of words with an '_' character when they co-occur often enough. Each run joins at most two tokens at a time, so you can run the function again on its own output to build phrases of three or more words. It is a wrapper around code from Mikolov's original word2vec release.
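The joining step can be illustrated with a small base R sketch. This is not the package's implementation: join_pass() is a hypothetical helper written for this example, and the score it uses, (count(a b) - min_count) / count(a) / count(b) * total_words, is an assumption based on how the original word2phrase.c appears to score pairs. The tiny min_count and threshold values are chosen only so that a toy sentence produces joins.

```r
# Illustrative single pass of bigram joining, using base R only.
# join_pass() is a hypothetical helper, not part of any package.
join_pass <- function(tokens, min_count = 5, threshold = 100) {
  n <- length(tokens)
  if (n < 2) return(tokens)

  # Unigram counts as a named integer vector.
  tab <- table(tokens)
  unigram <- setNames(as.integer(tab), names(tab))

  # Bigram counts, keyed by "a b" (tokens never contain spaces).
  tab2 <- table(paste(tokens[-n], tokens[-1]))
  bigram <- setNames(as.integer(tab2), names(tab2))

  out <- character(0)
  i <- 1
  while (i <= n) {
    if (i < n) {
      a <- tokens[i]
      b <- tokens[i + 1]
      pa <- unigram[[a]]
      pb <- unigram[[b]]
      pab <- bigram[[paste(a, b)]]
      # Assumed scoring rule; rare words never form phrases.
      score <- if (pa < min_count || pb < min_count) 0 else
        (pab - min_count) / pa / pb * n
      if (score > threshold) {
        out <- c(out, paste(a, b, sep = "_"))
        i <- i + 2  # both tokens are consumed by the join
        next
      }
    }
    out <- c(out, tokens[i])
    i <- i + 1
  }
  out
}

text <- "i visited new york city and she loves new york city too"
tokens <- strsplit(text, " ", fixed = TRUE)[[1]]

# First pass joins pairs; a second pass on the result builds longer phrases.
pass1 <- join_pass(tokens, min_count = 1, threshold = 2)
pass2 <- join_pass(pass1, min_count = 1, threshold = 2)

print(paste(pass1, collapse = " "))
# "i visited new_york city and she loves new_york city too"
print(paste(pass2, collapse = " "))
# "i visited new_york_city and she loves new_york_city too"
```

The second call mirrors running word2phrase again on its own output: "new_york" is treated as a single token, so it can be joined with "city".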
The name of output_file, the trained file where common phrases are now joined.
Tomas Mikolov
## Not run:
model=word2phrase("text8","vec.txt")
## End(Not run)