In ruimtehol: Learn Text 'Embeddings' with 'Starspace'

range.textspace

R Documentation

Get the scale of embedding similarities alongside a Starspace model

Description

Calculates the similarities between two embedding matrices and returns the range, the quantiles, and a histogram of the resulting similarities.

Usage

## S3 method for class 'textspace'
range(
  x,
  from = as.matrix(x),
  to = as.matrix(x, type = "labels"),
  probs = seq(0, 1, by = 0.01),
  breaks = "scott",
  ...
)

Arguments

`x`	an object of class `textspace` as returned by `starspace` or `starspace_load_model`
`from`	an embedding matrix. Defaults to the embeddings of all the labels and the words from the model.
`to`	an embedding matrix. Defaults to the embeddings of all the labels.
`probs`	numeric vector of probabilities between 0 and 1, passed on to `quantile`
`breaks`	passed on to `hist`
`...`	other parameters passed on to `hist`
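
The effect of `probs` and `breaks` can be previewed with base R alone. The sketch below is only an illustration on made-up numbers: `sims` is a hypothetical vector standing in for the similarities, and calling `hist` with `plot = FALSE` is an assumption about how a histogram object would be obtained, not a statement about the package internals.

```r
## Hypothetical stand-in for a vector of embedding similarities
set.seed(1)
sims <- runif(1000, min = -1, max = 1)

## The default probs give one quantile per percent: 101 values from 0% to 100%
q_default <- quantile(sims, probs = seq(0, 1, by = 0.01))
length(q_default)

## A shorter probs vector returns only the requested quantiles
q_custom <- quantile(sims, probs = c(0.05, 0.5, 0.95))
q_custom

## breaks accepts the name of a binning rule or a suggested number of bins
h_scott <- hist(sims, breaks = "scott", plot = FALSE)
h_20    <- hist(sims, breaks = 20, plot = FALSE)
class(h_scott)
```

A short `probs` vector is convenient when only a few cut-offs are of interest, while the default percent-by-percent grid gives a fuller picture of the distribution.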

Value

a list with elements

range: the range of the embedding similarities between `from` and `to`
quantile: the quantiles of the embedding similarities between `from` and `to`
hist: the histogram of the embedding similarities between `from` and `to`
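
To make the shape of the returned list concrete, here is a minimal base R sketch that builds a list with the same three elements from two random matrices. It assumes cosine similarity between rows, whereas the similarity actually used depends on the model. The helper `cosine_similarity` and the matrices `from_mat` and `to_mat` are hypothetical names for illustration and are not part of ruimtehol.

```r
## Hypothetical helper (not part of ruimtehol):
## cosine similarity between every row of a and every row of b
cosine_similarity <- function(a, b) {
  a_unit <- a / sqrt(rowSums(a^2))  # scale each row to unit length
  b_unit <- b / sqrt(rowSums(b^2))
  a_unit %*% t(b_unit)
}

## Made-up embeddings: 50 terms and 5 labels in 10 dimensions
set.seed(42)
from_mat <- matrix(rnorm(50 * 10), nrow = 50)
to_mat   <- matrix(rnorm(5 * 10), nrow = 5)

## One similarity per (from row, to row) pair, flattened to a vector
similarities <- as.vector(cosine_similarity(from_mat, to_mat))

sketch <- list(
  range    = range(similarities),
  quantile = quantile(similarities, probs = seq(0, 1, by = 0.01)),
  hist     = hist(similarities, breaks = "scott", plot = FALSE)
)
names(sketch)
sketch$range
```

Because every pair of rows contributes one value, the summaries describe all `nrow(from_mat) * nrow(to_mat)` similarities at once.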

Examples

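library(ruimtehol)

## Load the example data shipped with the package and keep the rows
## with depotdat before 2017-02-01
data(dekamer, package = "ruimtehol")
dekamer <- subset(dekamer, depotdat < as.Date("2017-02-01"))
## Split on non-word characters, drop empty tokens and re-join with single spaces
dekamer$text <- strsplit(dekamer$question, "\\W")
dekamer$text <- lapply(dekamer$text, FUN = function(x) setdiff(x, ""))
dekamer$text <- sapply(dekamer$text, 
                       FUN = function(x) paste(x, collapse = " "))
## Replace spaces in the theme with hyphens so that each label is a single token
dekamer$question_theme_main <- gsub(" ", "-", dekamer$question_theme_main)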

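## Train a small tagspace model on the lowercased text, with the theme as label
set.seed(123456789)
model <- embed_tagspace(x = tolower(dekamer$text), 
                        y = dekamer$question_theme_main, 
                        early_stopping = 0.8, 
                        dim = 10, minCount = 5)
## Similarities between all embeddings (from) and the label embeddings (to)
ranges <- range(model)
ranges$range
ranges$quantile
plot(ranges$hist, main = "Histogram of embedding similarities")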

ruimtehol documentation built on May 29, 2024, 5:26 a.m.