starspace_dictionary: Get the dictionary of a Starspace model

View source: R/embed-all-the-things.R

starspace_dictionary    R Documentation

Get the dictionary of a Starspace model

Description

Get the dictionary of a Starspace model: the words and labels the model knows, together with summary counts.

Usage

starspace_dictionary(object)

Arguments

object

An object of class textspace, as returned by starspace or starspace_load_model.

Value

A list with the following elements:

  1. ntokens: The number of tokens in the data

  2. nwords: The number of words which are part of the dictionary

  3. nlabels: The number of labels which are part of the dictionary

  4. labels: A character vector with the labels

  5. dictionary_size: The size of the dictionary (nwords + nlabels)

  6. dictionary: A data.frame with all the words and labels from the dictionary. It has columns term, is_word and is_label; is_word and is_label indicate for each term whether it is a word or a label
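The structure described above can be explored without training a model. The following is a minimal sketch that builds a small stand-in list by hand; every value in it is invented for illustration, and it assumes (not confirmed by this page) that is_word and is_label are logical columns. With a real model, dict would instead come from starspace_dictionary(model).

```r
# Hand-built stand-in for the list returned by starspace_dictionary().
# All terms and counts are invented; the logical type of is_word and
# is_label is an assumption made for this sketch.
dictionary <- data.frame(
  term     = c("budget", "health", "tax", "ECONOMY", "HEALTH-CARE"),
  is_word  = c(TRUE, TRUE, TRUE, FALSE, FALSE),
  is_label = c(FALSE, FALSE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)
dict <- list(
  ntokens         = 120L,
  nwords          = sum(dictionary$is_word),
  nlabels         = sum(dictionary$is_label),
  labels          = dictionary$term[dictionary$is_label],
  dictionary_size = nrow(dictionary),
  dictionary      = dictionary
)

# The documented relationship: dictionary_size is nwords + nlabels
stopifnot(dict$dictionary_size == dict$nwords + dict$nlabels)

# Split the dictionary into its words and its labels
words  <- dict$dictionary$term[dict$dictionary$is_word]
labels <- dict$dictionary$term[dict$dictionary$is_label]
print(words)
print(labels)
```

Filtering on is_word and is_label in this way is one straightforward route to a model's vocabulary or label set as a plain character vector.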

Examples

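## Load the example data and keep the questions deposited before February 2017
data(dekamer, package = "ruimtehol")
dekamer <- subset(dekamer, depotdat < as.Date("2017-02-01"))

## Tokenise on non-word characters, drop empty tokens,
## and paste the tokens back into one space-separated string
dekamer$text <- strsplit(dekamer$question, "\\W")
dekamer$text <- lapply(dekamer$text, FUN = function(x) x[x != ""])
dekamer$text <- sapply(dekamer$text, 
                       FUN = function(x) paste(x, collapse = " "))

## Replace spaces in the theme names by hyphens so that each theme is one label
dekamer$question_theme_main <- gsub(" ", "-", dekamer$question_theme_main)

## Train a tagspace model on the lowercased text and extract its dictionary
set.seed(123456789)
model <- embed_tagspace(x = tolower(dekamer$text), 
                        y = dekamer$question_theme_main, 
                        early_stopping = 0.8, 
                        dim = 10, minCount = 5)
dict <- starspace_dictionary(model)
str(dict)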

ruimtehol documentation built on Jan. 7, 2023, 1:25 a.m.