embed_docspace | R Documentation |
Build a Starspace model for content-based recommendation (docspace). For example, a user clicks on a webpage, and this webpage contains a bunch of words.
embed_docspace( x, model = "docspace.bin", early_stopping = 0.75, useBytes = FALSE, ... )
x |
a data.frame with user interest containing the columns user_id, doc_id and text. The user_id is an identifier of a user. The doc_id is an article or document identifier. The text column is a character field containing the words which are part of the doc_id; words should be separated by a space and should not contain any tab characters |
model |
name of the model which will be saved, passed on to |
early_stopping |
the percentage of the data that will be used as training data. If set to a value smaller than 1, 1- |
useBytes |
set to TRUE to avoid re-encoding when writing out train and/or test files. See |
... |
further arguments passed on to |
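The layout expected for x, as described in the arguments above, can be sketched with base R only. The toy values and the helper clean_text below are hypothetical and not part of the package; the sketch only illustrates the three required columns and the rule that words are separated by a space and contain no tab characters.

```r
# Hypothetical toy data (not from the package): one row per user/document pair
x <- data.frame(
  user_id = c("u1", "u1", "u2"),
  doc_id  = c("d1", "d2", "d1"),
  text    = c("budget\ttax  finance", "health care hospital", "budget\ttax  finance"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace tabs and repeated whitespace by a single space
clean_text <- function(txt) {
  trimws(gsub("[[:space:]]+", " ", txt))
}
x$text <- clean_text(x$text)

# Checks mirroring the documented requirements on x
stopifnot(all(c("user_id", "doc_id", "text") %in% colnames(x)))
stopifnot(is.character(x$text))
stopifnot(!any(grepl("\t", x$text, fixed = TRUE)))
```

A data.frame prepared this way has the shape that the x argument describes; whether further preprocessing is useful depends on the texts at hand.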
an object of class textspace as returned by starspace.
library(udpipe)
data(dekamer, package = "ruimtehol")
data(dekamer_theme_terminology, package = "ruimtehol")

## Which person is interested in which theme (aka document)
x <- table(dekamer$aut_person, dekamer$question_theme_main)
x <- as.data.frame(x)
colnames(x) <- c("user_id", "doc_id", "freq")

## Characterise the themes (aka document)
docs <- split(dekamer_theme_terminology, dekamer_theme_terminology$theme)
docs <- lapply(docs, FUN=function(x){
  data.frame(theme = x$theme[1], text = paste(x$term, collapse = " "), stringsAsFactors=FALSE)
})
docs <- do.call(rbind, docs)

## Build a model
train <- merge(x, docs, by.x = "doc_id", by.y = "theme")
train <- subset(train, user_id %in% sample(levels(train$user_id), 4))
set.seed(123456789)
model <- embed_docspace(train, dim = 10, early_stopping = 1)
plot(model)