embed_docspace | R Documentation |
Build a Starspace model for content-based recommendation (docspace). For example, a user clicks on a webpage, and this webpage contains a bunch of words.
embed_docspace(
x,
model = "docspace.bin",
early_stopping = 0.75,
useBytes = FALSE,
...
)
x |
a data.frame with user interest containing the columns user_id, doc_id and text. The user_id is an identifier of a user. The doc_id is an article or document identifier. The text column is a character field containing the words which are part of the doc_id; words should be separated by a space and should not contain any tab characters |
model |
name of the model which will be saved, passed on to |
early_stopping |
the percentage of the data that will be used as training data. If set to a value smaller than 1, 1- |
useBytes |
set to TRUE to avoid re-encoding when writing out train and/or test files. See |
... |
further arguments passed on to |
an object of class textspace as returned by starspace.
library(ruimtehol)
library(udpipe)
data(dekamer, package = "ruimtehol")
data(dekamer_theme_terminology, package = "ruimtehol")
## Which person is interested in which theme (aka document)
x <- table(dekamer$aut_person, dekamer$question_theme_main)
x <- as.data.frame(x)
colnames(x) <- c("user_id", "doc_id", "freq")
## Characterise the themes (aka document)
docs <- split(dekamer_theme_terminology, dekamer_theme_terminology$theme)
docs <- lapply(docs, FUN=function(x){
  data.frame(theme = x$theme[1], text = paste(x$term, collapse = " "),
             stringsAsFactors=FALSE)
})
docs <- do.call(rbind, docs)
## Build a model
train <- merge(x, docs, by.x = "doc_id", by.y = "theme")
## Set the seed before sampling so that the selection of users is reproducible
set.seed(123456789)
train <- subset(train, user_id %in% sample(levels(train$user_id), 4))
model <- embed_docspace(train, dim = 10, early_stopping = 1)
plot(model)
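The description of the x argument asks for a data.frame with the columns user_id, doc_id and text, where the words in text are separated by a space and contain no tab characters. The following is a minimal sketch in base R of how such input might be prepared. The clicks data and the clean_text helper are hypothetical and not part of ruimtehol; they only illustrate the expected layout.

```r
## Hypothetical raw click data; only the column names follow the documented layout of x
clicks <- data.frame(
  user_id = c("u1", "u1", "u2"),
  doc_id  = c("d1", "d2", "d1"),
  text    = c("budget\ttax  reform", " climate energy ", "budget\ttax  reform"),
  stringsAsFactors = FALSE
)

## Hypothetical helper (not part of ruimtehol): collapse tabs, newlines and
## repeated spaces into a single space, then trim leading/trailing whitespace
clean_text <- function(txt) {
  txt <- gsub("[[:space:]]+", " ", txt)
  trimws(txt)
}

clicks$text <- clean_text(clicks$text)

## Check the layout described for x before building a model
stopifnot(
  all(c("user_id", "doc_id", "text") %in% colnames(clicks)),
  is.character(clicks$text),
  !any(grepl("\t", clicks$text, fixed = TRUE))
)
```

A data.frame prepared along these lines could then be passed as x to embed_docspace.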