glove: Extract word vectors from GloVe word embedding
In wordsalad: Provide Tools to Extract and Analyze Word Vectors


View source: R/glove.R

The calculations are done with the text2vec package.

glove(
  text,
  tokenizer = text2vec::space_tokenizer,
  dim = 10L,
  window = 5L,
  min_count = 5L,
  n_iter = 10L,
  x_max = 10L,
  stopwords = character(),
  convergence_tol = -1,
  threads = 1,
  composition = c("tibble", "data.frame", "matrix"),
  verbose = FALSE
)
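
A minimal sketch of a call, assuming `text` accepts a character vector with one document per element (the tokenizer default, text2vec::space_tokenizer, works on character vectors). The toy corpus and the argument values are made up for illustration; the sentences are repeated so that every word clears the default `min_count` of 5.

```r
# Toy corpus (hypothetical): three short sentences repeated 20 times each,
# so that every word occurs at least 5 times.
sentences <- c(
  "the cat sat on the mat",
  "the dog sat on the rug",
  "a cat and a dog play on the mat"
)
corpus <- rep(sentences, times = 20)

# Only run the model if the wordsalad package is installed.
if (requireNamespace("wordsalad", quietly = TRUE)) {
  vectors <- wordsalad::glove(
    corpus,
    dim = 5L,               # 5-dimensional word vectors
    window = 3L,            # narrower context window than the default
    n_iter = 20L,           # more iterations than the default
    composition = "matrix"  # return a plain matrix instead of a tibble
  )
  print(dim(vectors))
}
```

On a corpus this small the vectors carry little meaning; the example only shows how the arguments fit together.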

`text`	Character string.
`tokenizer`	Function used to perform tokenization. Defaults to text2vec::space_tokenizer.
`dim`	Integer, number of dimensions of the resulting word vectors. Defaults to 10.
`window`	Integer, skip length between words, i.e. the size of the context window used when counting co-occurrences. Defaults to 5.
`min_count`	Integer, number of times a token should appear to be considered in the model. Defaults to 5.
`n_iter`	Integer, number of training iterations. Defaults to 10.
`x_max`	Integer, maximum number of co-occurrences to use in the weighting function. Defaults to 10.
`stopwords`	Character, a vector of stop words to exclude from training.
`convergence_tol`	Numeric, tolerance that defines the early stopping strategy. Fitting stops when either of the following conditions is satisfied: (a) all iterations have been run, or (b) `cost_previous_iter / cost_current_iter - 1 < convergence_tol`. With the default of -1, condition (b) cannot hold for positive costs, so all iterations are run. Defaults to -1.
`threads`	Integer, number of CPU threads to use. Defaults to 1.
`composition`	Character, either "tibble", "matrix", or "data.frame", giving the format of the resulting word vectors. Defaults to "tibble".
`verbose`	Logical, controls whether progress is reported as operations are executed.
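
The roles of `x_max` and `convergence_tol` can be illustrated in base R. This is a sketch under stated assumptions, not the text2vec implementation: `glove_weight()` and `stop_iteration()` are hypothetical helper names, the exponent `alpha = 0.75` is the value from the original GloVe formulation and is not an argument of `glove()`, and the cost values are invented.

```r
# GloVe weighting function: down-weights rare co-occurrences and caps the
# weight at 1 once the count reaches x_max.
# alpha = 0.75 is an assumption taken from the original GloVe formulation.
glove_weight <- function(x, x_max = 10, alpha = 0.75) {
  ifelse(x < x_max, (x / x_max)^alpha, 1)
}

# Early-stopping rule as described for convergence_tol: given the cost after
# each iteration, return the iteration at which fitting would stop.
stop_iteration <- function(costs, convergence_tol = -1) {
  for (i in seq_along(costs)[-1]) {
    improvement <- costs[i - 1] / costs[i] - 1
    if (improvement < convergence_tol) {
      return(i)  # condition (b): relative improvement fell below tolerance
    }
  }
  length(costs)  # condition (a): all iterations were run
}

# Counts at or above x_max all receive weight 1.
glove_weight(c(1, 5, 10, 50))

costs <- c(0.50, 0.30, 0.25, 0.249, 0.2489)  # hypothetical cost values
stop_iteration(costs)                          # returns 5 (no early stopping)
stop_iteration(costs, convergence_tol = 0.01)  # returns 4
```

With the default tolerance of -1 the relative improvement can never fall below the threshold, so the number of iterations is controlled by `n_iter` alone.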