word2phrase: Convert words to phrases

View source: R/word2vec.R

word2phrase R Documentation

Convert words to phrases

Description

Convert words to phrases in a text file.

Usage

word2phrase(train_file, output_file, debug_mode = 0, min_count = 5,
  threshold = 100, force = FALSE)

Arguments

train_file

Path of a single .txt file for training. Tokens are split on spaces.

output_file

Path of the output file.

debug_mode

Debug mode. Must be 0, 1, or 2: 0 is silent; 1 prints summary statistics; 2 prints progress regularly.

min_count

Minimum number of times a word must appear to be included in the samples. Higher values help reduce model size.

threshold

Threshold score above which a pair of adjacent words is treated as a phrase. Higher values produce fewer phrases.

force

Whether to overwrite existing files at the output location. Default is FALSE.

Details

This function attempts to learn phrases from a text document. It does so by joining adjacent pairs of words with an '_' character. Each run joins pairs of tokens, so you can run the function again on its own output to build longer multiword phrases. It is a wrapper around code from Mikolov's original word2vec release.
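
To make the roles of min_count and threshold concrete, here is a rough pure-R sketch of how a pair of adjacent words might be scored and joined. It assumes the scoring rule used in the original word2phrase C code, score = (count(a b) - min_count) / (count(a) * count(b)) * total_words; that formula is not stated on this page, and join_phrases() is a hypothetical helper written for illustration, not the package's implementation.

```r
# Illustrative sketch only; the package itself calls compiled C code.
# join_phrases() is a hypothetical helper, not part of wordVectors.
join_phrases <- function(tokens, min_count = 5, threshold = 100) {
  n <- length(tokens)
  if (n < 2) return(tokens)

  # Count occurrences as a plain named integer vector.
  count_values <- function(x) {
    tab <- table(x)
    setNames(as.integer(tab), names(tab))
  }
  unigram <- count_values(tokens)
  bigram <- count_values(paste(tokens[-n], tokens[-1]))

  out <- character(0)
  i <- 1
  while (i <= n) {
    if (i < n) {
      a <- tokens[i]
      b <- tokens[i + 1]
      ca <- unigram[[a]]
      cb <- unigram[[b]]
      cab <- bigram[[paste(a, b)]]
      # Assumed scoring rule: rare words never form phrases.
      score <- if (ca >= min_count && cb >= min_count) {
        (cab - min_count) / ca / cb * n
      } else {
        0
      }
      if (score > threshold) {
        # Join the pair and skip past both words.
        out <- c(out, paste(a, b, sep = "_"))
        i <- i + 2
        next
      }
    }
    out <- c(out, tokens[i])
    i <- i + 1
  }
  out
}

tokens <- c("new", "york", "is", "big", "new", "york", "is", "old",
            "new", "york")
# Small values are used here because the toy corpus is tiny.
join_phrases(tokens, min_count = 2, threshold = 1)
```

Under this rule, raising threshold or min_count makes it harder for a pair to qualify, which is consistent with the argument descriptions above.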

Value

The name of output_file, the trained file where common phrases are now joined.

Author(s)

Tomas Mikolov

Examples

## Not run: 
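library(wordVectors)
# The return value is the path of the output file, not a model object.
phrased_file <- word2phrase("text8", "text8_phrases.txt")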

## End(Not run)
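
Because the function returns the name of its output file, repeated passes can be chained by feeding each result into the next call. The sketch below shows one way to do that; run_phrase_passes() and the "_passN.txt" file naming are hypothetical conveniences, and the phrase_fun argument exists only so the loop can be tried with a stand-in function when the package is not installed.

```r
# Hypothetical helper: run word2phrase `passes` times, feeding each
# pass's output file into the next pass. Extra arguments in `...`
# (for example min_count, threshold, force) are forwarded unchanged.
run_phrase_passes <- function(train_file, output_prefix, passes = 2,
                              phrase_fun = wordVectors::word2phrase, ...) {
  input <- train_file
  for (pass in seq_len(passes)) {
    output <- sprintf("%s_pass%d.txt", output_prefix, pass)
    # word2phrase is documented to return the name of output_file.
    input <- phrase_fun(input, output, ...)
  }
  input
}

## Not run:
# final_file <- run_phrase_passes("text8", "text8_phrases", passes = 2)
## End(Not run)
```

With two passes, a token joined in the first pass can be joined again in the second, so phrases of up to four words become possible.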

bmschmidt/wordVectors documentation built on June 2, 2022, 3:53 p.m.