View source: R/r-all-the-things.R
embed_tagspace | R Documentation |
Build a Starspace model to be used for classification purposes
embed_tagspace( x, y, model = "tagspace.bin", early_stopping = 0.75, useBytes = FALSE, ... )
x |
a character vector of text where tokens are separated by spaces |
y |
a character vector of classes to predict or a list with the same length of |
model |
name of the model which will be saved, passed on to |
early_stopping |
the fraction of the data (a value between 0 and 1) that will be used as training data. If set to a value smaller than 1, 1- |
useBytes |
set to TRUE to avoid re-encoding when writing out train and/or test files. See |
... |
further arguments passed on to |
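As a rough illustration of the inputs described above, the base R sketch below builds x as space-separated tokens, builds y both as a character vector and as a list of the same length as x, and shows what a training fraction of 0.75 means for a tiny data set. The toy data and the names `to_tokens`, `n_train` and `n_heldout` are hypothetical. The split arithmetic is an assumption made for illustration only and is not taken from how embed_tagspace partitions the data internally.

```r
# Hypothetical toy data, not part of the package
docs <- c("The train was late, again!",
          "New asylum centre opens",
          "Taxes go up",
          "More trains on Sunday")

# Hypothetical helper: split on non-word characters, drop empty tokens,
# rejoin with single spaces and lowercase the result
to_tokens <- function(txt) {
  parts <- strsplit(txt, "\\W")
  parts <- lapply(parts, function(p) p[p != ""])
  tolower(sapply(parts, paste, collapse = " "))
}
x <- to_tokens(docs)

# y as a character vector: one class per document, spaces replaced by "-"
y_single <- gsub(" ", "-",
                 c("public transport", "migration", "finance", "public transport"))

# y as a list with the same length as x: several classes per document
y_multi <- strsplit(c("transport,mobility", "migration", "finance,economy", "transport"),
                    split = ",")

# Assumed illustration of early_stopping = 0.75: 75% of the rows are used
# for training and the remaining rows are held out
early_stopping <- 0.75
n_train <- floor(early_stopping * length(x))
n_heldout <- length(x) - n_train
```

Either `y_single` or `y_multi` could then be supplied as y alongside `x`; the list form is what a multi-label setup would use.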
an object of class textspace as returned by starspace.
data(dekamer, package = "ruimtehol")
dekamer <- subset(dekamer, depotdat < as.Date("2017-02-01"))

# Tokenise the questions: split on non-word characters, drop empty tokens
# and rejoin the tokens with single spaces
dekamer$text <- strsplit(dekamer$question, "\\W")
dekamer$text <- lapply(dekamer$text, FUN = function(x) x[x != ""])
dekamer$text <- sapply(dekamer$text, FUN = function(x) paste(x, collapse = " "))

# Single-label classification: y is a character vector with one theme per question
dekamer$question_theme_main <- gsub(" ", "-", dekamer$question_theme_main)
set.seed(123456789)
model <- embed_tagspace(x = tolower(dekamer$text),
                        y = dekamer$question_theme_main,
                        early_stopping = 0.8,
                        dim = 10, minCount = 5)
plot(model)
predict(model, "de nmbs heeft het treinaanbod uitgebreid", k = 3)
predict(model, "de migranten komen naar europa, in asielcentra ...")
starspace_embedding(model, "de nmbs heeft het treinaanbod uitgebreid")
starspace_embedding(model, "__label__MIGRATIEBELEID", type = "ngram")

# Multi-label classification: y is a list with several themes per question
dekamer$question_themes <- gsub(" ", "-", dekamer$question_theme)
dekamer$question_themes <- strsplit(dekamer$question_themes, split = ",")
set.seed(123456789)
model <- embed_tagspace(x = tolower(dekamer$text),
                        y = dekamer$question_themes,
                        early_stopping = 0.8,
                        dim = 50, minCount = 2, epoch = 50)
plot(model)
predict(model, "de nmbs heeft het treinaanbod uitgebreid")
predict(model, "de migranten komen naar europa , in asielcentra ...")

# Compare a sentence embedding with the label embeddings
embeddings_labels <- as.matrix(model, type = "labels")
emb <- starspace_embedding(model, "de nmbs heeft het treinaanbod uitgebreid")
embedding_similarity(emb, embeddings_labels, type = "cosine", top_n = 5)
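The final call in the examples ranks label embeddings by their cosine similarity to a sentence embedding. A minimal base R sketch of that kind of ranking, on made-up vectors, is given below. The matrix values, the label names and the helper `cosine_top_n` are hypothetical, and the code does not reproduce the package's own implementation.

```r
# Hypothetical 1 x 3 sentence embedding and 3 x 3 label embedding matrix
emb <- matrix(c(1, 0, 1), nrow = 1)
labels <- rbind(
  "__label__A" = c(1, 0, 1),
  "__label__B" = c(0, 1, 0),
  "__label__C" = c(1, 1, 0)
)

# Hypothetical helper: cosine similarity between one vector and every row
# of a matrix, returning the top_n most similar rows in decreasing order
cosine_top_n <- function(v, m, top_n = 2) {
  v <- as.numeric(v)
  sims <- as.numeric(m %*% v) / (sqrt(rowSums(m^2)) * sqrt(sum(v^2)))
  names(sims) <- rownames(m)
  head(sort(sims, decreasing = TRUE), top_n)
}

top <- cosine_top_n(emb, labels, top_n = 2)
print(top)
```

Here the label whose vector points in the same direction as the sentence vector ranks first, and the orthogonal label is dropped from the top two.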