textSimilarityMatrix: Compute semantic similarity scores between all combinations...

View source: R/3_1_textSimilarity.R

textSimilarityMatrix    R Documentation

Compute semantic similarity scores between all combinations in a word embedding

Description

Compute semantic similarity scores between all combinations in a word embedding

Usage

textSimilarityMatrix(x, method = "cosine", center = TRUE, scale = FALSE)

Arguments

x

Word embeddings from textEmbed.

method

(character) Type of similarity measure to compute. Default is "cosine"; "spearman" and "pearson" are also available, as are the measures from textDistance(), which are converted to similarities here as 1 - textDistance: "euclidean", "maximum", "manhattan", "canberra", "binary" and "minkowski".
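To illustrate what the different measure families return, here is a minimal base R sketch on a hypothetical toy matrix (rows are texts, columns are embedding dimensions). It is not the package's implementation: the matrix emb and all variable names are assumptions made for illustration, and the centering and scaling options are ignored.

```r
# Hypothetical toy "embeddings": 3 texts x 4 dimensions (not produced by textEmbed).
emb <- matrix(
  c(1, 0, 2, 1,
    2, 0, 4, 2,
    0, 3, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("a", "b", "c"), NULL)
)

# Cosine similarity between all pairs of rows:
# dot products divided by the product of the row norms.
row_norms <- sqrt(rowSums(emb^2))
cosine_sim <- (emb %*% t(emb)) / (row_norms %o% row_norms)

# Correlation-based alternative: Spearman correlation between rows.
# cor() correlates columns, so the matrix is transposed first.
spearman_sim <- cor(t(emb), method = "spearman")

# A distance turned into a similarity as 1 - distance.
# Note that this can be negative for unbounded distances such as Euclidean.
euclid_sim <- 1 - as.matrix(dist(emb, method = "euclidean"))

round(cosine_sim, 3)
```

In each case the result is a square, symmetric matrix with one row and one column per text and ones on the diagonal.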

center

(boolean; from base::scale) If center is TRUE then centering is done by subtracting the column means (omitting NAs) of x from their corresponding columns, and if center is FALSE, no centering is done.

scale

(boolean; from base::scale) If scale is TRUE, scaling is done by dividing the (centered) columns of x by their standard deviations if center is TRUE, and by their root mean square otherwise. If scale is FALSE, no scaling is done.
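Since center and scale are described as coming from base::scale, the following sketch shows what those two options do in base::scale itself. The small matrix emb is a hypothetical stand-in for an embedding table; how textSimilarityMatrix applies these options internally is not shown here.

```r
# Hypothetical 3 x 2 matrix; columns play the role of embedding dimensions.
emb <- matrix(c(1, 2, 3,
                4, 6, 8),
              nrow = 3,
              dimnames = list(NULL, c("Dim1", "Dim2")))

# center = TRUE, scale = FALSE: subtract each column's mean.
centered <- scale(emb, center = TRUE, scale = FALSE)
colMeans(centered)            # both column means are now 0

# center = TRUE, scale = TRUE: also divide by each column's standard deviation.
standardized <- scale(emb, center = TRUE, scale = TRUE)
apply(standardized, 2, sd)    # both standard deviations are now 1

# center = FALSE, scale = TRUE: divide by the root mean square,
# computed by base::scale as sqrt(sum(x^2) / (n - 1)).
rms_scaled <- scale(emb, center = FALSE, scale = TRUE)
attr(rms_scaled, "scaled:scale")
```

With the documented defaults (center = TRUE, scale = FALSE), only the first of these transformations would apply.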

Value

A matrix of semantic similarity scores, with one score for each pair of rows in x.

See Also

see textSimilarityNorm

Examples

# word_embeddings_4 is example data from the text package.
library(text)
similarity_scores <- textSimilarityMatrix(word_embeddings_4$texts$harmonytext[1:3, ])
round(similarity_scores, 3)

text documentation built on Sept. 11, 2024, 7:22 p.m.