avg_cos_similarity: Computes the average similarity vector between a set of cues...


View source: R/avg_cos_similarity.R

Description

Computes, for each cue word, the similarity vector between that cue and the full vocabulary, averaged over multiple embedding models.

Usage

avg_cos_similarity(embeds_list, cues, method = "cosine", norm = "l2")

Arguments

embeds_list

a list of trained word embedding models, each a V by D matrix (one row per vocabulary word, one column per embedding dimension)

cues

a character vector of cue words for which the similarity vector will be computed

method

character, passed to text2vec's sim2: the similarity measure to be used. One of c("cosine", "jaccard").

norm

character, passed to text2vec's sim2: how to scale the input matrices. One of c("l2", "none"). If the matrices are already scaled, use "none".
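
To illustrate what the norm argument controls, here is a minimal base-R sketch. It does not call text2vec; the helper name l2_normalize and the toy matrix are hypothetical and exist only for this example. It shows that cosine similarity equals the inner product of L2-normalized rows, which is why rows that are already unit length need no further scaling.

```r
# Hypothetical helper: scale each row of a matrix to unit L2 length.
l2_normalize <- function(m) m / sqrt(rowSums(m^2))

# Toy embedding matrix with 3 words and 2 dimensions (made-up values).
emb <- matrix(c(3, 4,
                1, 0,
                0, 2),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), NULL))

# Cosine similarity straight from the definition: dot product / (norm * norm).
row_norms  <- sqrt(rowSums(emb^2))
cos_direct <- tcrossprod(emb) / outer(row_norms, row_norms)

# Same result: normalize the rows first, then take plain inner products.
emb_unit      <- l2_normalize(emb)
cos_from_unit <- tcrossprod(emb_unit)

# The two routes agree up to floating-point error.
isTRUE(all.equal(cos_direct, cos_from_unit))
```

In this sketch, skipping the normalization step on emb_unit is safe because its rows already have length 1; skipping it on emb would return raw inner products rather than cosines.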

Value

a matrix of similarity vectors, one row for each cue

Note

Embedding models should not be averaged directly; instead, users should compute the similarity vectors within each model and average those. This function makes that easier. See Rodriguez & Spirling.
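
The point of this note can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the helper names cue_cosine and avg_sim_sketch are hypothetical, the two toy matrices stand in for separately trained models, and all models are assumed to share the same vocabulary as row names. One likely reason for the advice is that the dimensions of separately trained models are not aligned with each other, so the toy second model below is simply the first with its two dimensions swapped.

```r
# Hypothetical helper: cosine similarity between cue rows and every vocabulary row.
cue_cosine <- function(embeds, cues) {
  unit <- embeds / sqrt(rowSums(embeds^2))       # L2-normalize each row
  tcrossprod(unit[cues, , drop = FALSE], unit)   # cues x vocabulary matrix
}

# Hypothetical sketch: compute similarities per model, then average element-wise.
avg_sim_sketch <- function(embeds_list, cues) {
  sims <- lapply(embeds_list, cue_cosine, cues = cues)
  Reduce(`+`, sims) / length(sims)
}

# Two toy "models" over the same vocabulary (made-up values).
vocab <- c("tax", "budget", "river")
m1 <- matrix(c(1, 0,  1, 1,  0, 1), nrow = 3, byrow = TRUE,
             dimnames = list(vocab, NULL))
m2 <- m1[, 2:1]   # same geometry as m1, dimensions swapped

# Averaging the similarity vectors keeps "tax" and "river" orthogonal.
avg_sim <- avg_sim_sketch(list(m1, m2), cues = "tax")

# Averaging the embeddings first makes "tax" and "river" look identical.
naive_sim <- cue_cosine((m1 + m2) / 2, cues = "tax")
```

Within each toy model, "tax" and "river" have cosine similarity 0, and avg_sim preserves that. In naive_sim the same pair has similarity 1, because the element-wise mean of the two matrices collapses both words onto the same vector.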


prodriguezsosa/weeval documentation built on Aug. 15, 2020, 7:01 a.m.