embed_terms: Generate Embeddings of Terms


View source: R/term_embeddings.R

Description

Generate embeddings of terms from descriptions of visits using the GloVe algorithm. By default the order of the terms is ignored (all weights in the term co-occurrence matrix are equal to 1) and only terms occurring at least 5 times are embedded.
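To illustrate what an order-independent co-occurrence matrix with unit weights looks like, here is a minimal base R sketch. It is not the package's implementation: the `visits` vector, the `term_count_min` value of 2, and the choice to count each pair of distinct terms once per visit are all assumptions made for the example.

```r
# Hypothetical input: one string per visit, terms separated by ", ".
visits <- c("cough, fever", "fever, headache, cough", "headache, rash")

# Split every description into its terms.
term_lists <- strsplit(visits, ", ", fixed = TRUE)

# Keep only terms that occur at least term_count_min times (assumed: 2).
term_count_min <- 2L
term_counts <- table(unlist(term_lists))
vocab <- names(term_counts)[term_counts >= term_count_min]

# Symmetric co-occurrence matrix: each pair of distinct terms found in the
# same visit adds 1, regardless of where the terms appear in the description.
tcm <- matrix(0, length(vocab), length(vocab), dimnames = list(vocab, vocab))
for (terms in term_lists) {
  terms <- intersect(terms, vocab)  # drop rare terms and duplicates
  if (length(terms) < 2L) next      # nothing to pair up in this visit
  pairs <- combn(terms, 2L)
  for (k in seq_len(ncol(pairs))) {
    a <- pairs[1L, k]
    b <- pairs[2L, k]
    tcm[a, b] <- tcm[a, b] + 1
    tcm[b, a] <- tcm[b, a] + 1
  }
}
tcm
```

In this toy input "rash" occurs only once, so it is dropped from the vocabulary before any pairs are counted.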

Usage

embed_terms(merged_terms, embedding_size = 20L, term_count_min = 5L,
  x_max = 10L, n_iter = 15L)

Arguments

merged_terms

A character vector of visit descriptions, with the terms in each description separated by ", "

embedding_size

An integer, the number of dimensions of each embedding vector (default: 20)

term_count_min

The minimum number of occurrences a term needs in order to be embedded (default: 5)

x_max

The x_max parameter of GloVe; see ?text2vec::GlobalVectors (default: 10)

n_iter

The number of GloVe training epochs (default: 15)
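As background on x_max: in the GloVe objective, each co-occurrence count x is weighted by (x / x_max)^alpha, capped at 1, so counts at or above x_max all get full weight. The sketch below reproduces that weighting in base R. The function name `glove_weight` is hypothetical, and alpha = 0.75 is the value commonly used for GloVe, assumed here rather than taken from this package.

```r
# Hypothetical helper showing the GloVe weighting function.
# x: co-occurrence counts; x_max: the cap; alpha: exponent (assumed 0.75).
glove_weight <- function(x, x_max = 10, alpha = 0.75) {
  pmin((x / x_max)^alpha, 1)
}

# Counts below x_max are down-weighted; counts at or above it get weight 1.
glove_weight(c(0, 5, 10, 100))
```

Raising x_max therefore lowers the relative weight of pairs whose counts fall below the new cap.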

Value

A matrix of embeddings of the terms.
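A common way to use such a matrix is to compare terms by cosine similarity. The sketch below assumes one row per term with the terms as row names, which this page does not confirm, and uses a small made-up matrix `emb` and a hypothetical helper `cosine_similarity` rather than real output from embed_terms.

```r
# Hypothetical embedding matrix: one row per term, one column per dimension.
emb <- rbind(cough = c(1, 0), fever = c(1, 1), rash = c(0, 2))

# Cosine similarity between all pairs of rows.
cosine_similarity <- function(m) {
  unit <- m / sqrt(rowSums(m^2))  # scale every row to unit length
  unit %*% t(unit)
}

sims <- cosine_similarity(emb)
sims
```

If the matrix returned by embed_terms stores terms in columns instead, transpose it with t() before calling the helper.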

Examples

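library(memr)

# Embed every term, including those that occur only once
inter_term_vectors <- embed_terms(interviews,
  term_count_min = 1L)
inter_term_vectors

# Use 10-dimensional embeddings instead of the default 20
inter_term_vectors <- embed_terms(interviews,
  term_count_min = 1L, embedding_size = 10L)
inter_term_vectors

# Train for more epochs and with a larger x_max
inter_term_vectors <- embed_terms(interviews, embedding_size = 10L,
  term_count_min = 1L, n_iter = 50L, x_max = 20L)
inter_term_vectors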

adamgdobrakowski/memr documentation built on Sept. 4, 2021, 3:45 a.m.