starspace_save_model: Save a starspace model as a binary or tab-delimited TSV file

View source: R/embed-all-the-things.R

starspace_save_model    R Documentation

Save a starspace model as a binary or tab-delimited TSV file

Description


Save a starspace model as a binary or a tab-delimited TSV file

Usage


  file = "textspace.ruimtehol",
  method = c("ruimtehol", "tsv-data.table", "binary", "tsv-starspace"),
  labels = data.frame(code = character(), label = character(), stringsAsFactors =

Arguments



object: an object of class textspace as returned by starspace or starspace_load_model


file: character string with the path to the file where the model is saved


method: character indicating the method of saving. Possible values are 'ruimtehol', 'tsv-data.table', 'binary' and 'tsv-starspace'. Defaults to 'ruimtehol'.

  • The first method: 'ruimtehol' saves the R object and the embeddings and optionally the label definitions with saveRDS. This object can be loaded back in with starspace_load_model.

  • The second method: 'tsv-data.table' saves the model embeddings as a tab-delimited flat file using the fast data.table fwrite function

  • The third method: 'binary' saves the model as a binary file using the original methods of the Starspace authors

  • The fourth method: 'tsv-starspace' saves the model as a tab-delimited flat file using the original methods of the Starspace authors
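
The two tab-delimited methods write the embeddings to a flat file that non-R tools can read. As a rough illustration only, the sketch below uses base R (not data.table or the Starspace writer) and a made-up 2 x 3 matrix to show what writing and reading back such a file looks like. The term names and values are hypothetical, and the exact column layout produced by starspace_save_model may differ.

```r
# Hypothetical 2 x 3 embedding matrix; the row names are the terms.
emb <- matrix(c(0.1, 0.2, 0.3,
                0.4, 0.5, 0.6),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("brussels", "__label__1"), NULL))

# Write one term per row: term name first, then its embedding values.
path <- tempfile(fileext = ".tsv")
write.table(emb, file = path, sep = "\t", quote = FALSE,
            row.names = TRUE, col.names = FALSE)

# Read the file back, using the first column as row names.
back <- read.delim(path, header = FALSE, row.names = 1,
                   stringsAsFactors = FALSE)
restored <- as.matrix(back)
dimnames(restored) <- list(rownames(back), NULL)

unlink(path)
```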


labels: a data.frame with at least the columns code and label, which is saved when method is set to 'ruimtehol'. This lets you store the mapping between Starspace labels and your own codes alongside the model, where code is your internal code and label is your label.
A new column called label_starspace is added to this data.frame. It combines the Starspace label prefix with the code column of your data.frame, as this combination is the label Starspace uses internally.
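
A minimal sketch of how such a label_starspace column could be derived, assuming the Starspace default label prefix "__label__". The codes and labels below are made up, and the package may build the column differently internally.

```r
# Hypothetical mapping between internal codes and readable labels.
codes <- data.frame(code = c(1L, 2L),
                    label = c("economy", "health"),
                    stringsAsFactors = FALSE)

# Assumption: the model was trained with the default Starspace label prefix.
prefix <- "__label__"

# Combine the prefix with the code to get the label used inside the model.
codes$label_starspace <- paste0(prefix, codes$code)
```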


Value

invisibly, the character string with the path to the file of the saved object
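
To illustrate what an invisible return value means in practice, here is a small stand-in written with saveRDS. The helper save_demo is hypothetical and is not part of ruimtehol. It only mimics the pattern of saving an object and handing back the file path without printing it.

```r
# Hypothetical helper: save an object with saveRDS and return the path invisibly.
save_demo <- function(object, file) {
  saveRDS(object, file = file)
  invisible(file)
}

path <- tempfile(fileext = ".rds")

# Nothing is printed by the call, but the path can still be captured.
out <- save_demo(list(dim = 10L), path)

# The captured path can be passed straight on to the matching load step.
restored <- readRDS(out)

unlink(path)
```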


Note

It is advised to always use the 'ruimtehol' method, as it works nicely together with the starspace_load_model function. Use another method only if you need to provide the models to non-R users and prefer the methods provided by the Starspace authors over the faster and more portable 'ruimtehol' method.

See Also

Examples



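library(ruimtehol)
data(dekamer, package = "ruimtehol")
## split the questions on non-word characters and drop empty tokens
dekamer$text <- strsplit(dekamer$question, "\\W")
dekamer$text <- lapply(dekamer$text, FUN = function(x) x[x != ""])
## paste the tokens back together, separated by single spaces
dekamer$text <- sapply(dekamer$text, 
                       FUN = function(x) paste(x, collapse = " "))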

dekamer$target <- as.factor(dekamer$question_theme_main)
codes <- data.frame(code = seq_along(levels(dekamer$target)), 
                    label = levels(dekamer$target), stringsAsFactors = FALSE)
dekamer$target <- as.integer(dekamer$target)
model <- embed_tagspace(x = dekamer$text, 
                        y = dekamer$target, 
                        early_stopping = 0.8,
                        dim = 10, minCount = 5)
starspace_save_model(model, file = "textspace.ruimtehol", method = "ruimtehol",
                     labels = codes)
model <- starspace_load_model("textspace.ruimtehol", method = "ruimtehol")
starspace_save_model(model, file = "embeddings.tsv", method = "tsv-data.table")

## clean up for cran
file.remove("textspace.ruimtehol")
file.remove("embeddings.tsv")

ruimtehol documentation built on Jan. 7, 2023, 1:25 a.m.