range.textspace: Get the scale of embedding similarities alongside a Starspace...

View source: R/utils.R

range.textspace    R Documentation

Get the scale of embedding similarities alongside a Starspace model

Description

Calculates the embedding similarities between two embedding matrices and returns the range of the resulting similarities, together with their quantiles and a histogram.

Usage

## S3 method for class 'textspace'
range(
  x,
  from = as.matrix(x),
  to = as.matrix(x, type = "labels"),
  probs = seq(0, 1, by = 0.01),
  breaks = "scott",
  ...
)

Arguments

x

an object of class textspace as returned by starspace or starspace_load_model

from

an embedding matrix. Defaults to the embeddings of all the labels and the words from the model.

to

an embedding matrix. Defaults to the embeddings of all the labels.

probs

numeric vector of probabilities with values between 0 and 1, passed on to quantile

breaks

passed on to hist

...

other parameters passed on to hist
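
To illustrate how probs and breaks are interpreted, the sketch below applies base R's quantile and hist to a made-up vector of similarity values. The vector sims is simulated for illustration only and does not come from a Starspace model.

```r
# Hypothetical similarity values, simulated for illustration only
set.seed(42)
sims <- runif(1000, min = -1, max = 1)

# probs selects which quantiles are reported; here only five of them
q <- quantile(sims, probs = c(0, 0.25, 0.5, 0.75, 1))
names(q)

# breaks is handed to hist: "scott" selects Scott's rule for the bin width,
# while a single number is treated as a suggested number of bins
h_scott <- hist(sims, breaks = "scott", plot = FALSE)
h_ten   <- hist(sims, breaks = 10, plot = FALSE)
length(h_scott$counts)
length(h_ten$counts)
```

The default probs = seq(0, 1, by = 0.01) would report 101 quantiles, from the minimum (0%) to the maximum (100%).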

Value

a list with elements

  • range: the range of the embedding similarities between from and to

  • quantile: the quantiles of the embedding similarities between from and to

  • hist: the histogram of the embedding similarities between from and to
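
The shape of the returned list can be pictured with a small base-R sketch. It assumes cosine similarity as the similarity measure, which this page does not state, and uses random matrices from and to in place of real model embeddings. The helper cosine_similarity is hypothetical and is not part of ruimtehol.

```r
# Hypothetical helper: cosine similarity between the rows of a and the rows of b
# (assumption: this page does not say which similarity measure is used)
cosine_similarity <- function(a, b) {
  a <- a / sqrt(rowSums(a^2))  # scale every row of a to unit length
  b <- b / sqrt(rowSums(b^2))  # scale every row of b to unit length
  a %*% t(b)
}

set.seed(1)
from <- matrix(rnorm(50 * 10), nrow = 50)  # stand-in for 50 word/label embeddings
to   <- matrix(rnorm(5 * 10), nrow = 5)    # stand-in for 5 label embeddings

sims <- cosine_similarity(from, to)        # 50 x 5 matrix of similarities

# Assemble a list with the three documented elements
result <- list(
  range    = range(sims),
  quantile = quantile(sims, probs = seq(0, 1, by = 0.01)),
  hist     = hist(sims, breaks = "scott", plot = FALSE)
)
str(result, max.level = 1)
```

Whether the actual method builds the histogram with plot = FALSE is also an assumption; it is used here so that the sketch does not open a graphics device.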

Examples

data(dekamer, package = "ruimtehol")
dekamer <- subset(dekamer, depotdat < as.Date("2017-02-01"))
dekamer$text <- strsplit(dekamer$question, "\\W")
dekamer$text <- lapply(dekamer$text, FUN = function(x) setdiff(x, ""))
dekamer$text <- sapply(dekamer$text, 
                       FUN = function(x) paste(x, collapse = " "))
dekamer$question_theme_main <- gsub(" ", "-", dekamer$question_theme_main)

set.seed(123456789)
model <- embed_tagspace(x = tolower(dekamer$text), 
                        y = dekamer$question_theme_main, 
                        early_stopping = 0.8, 
                        dim = 10, minCount = 5)
ranges <- range(model)
ranges$range
ranges$quantile
plot(ranges$hist, main = "Histogram of embedding similarities")

ruimtehol documentation built on Jan. 7, 2023, 1:25 a.m.