corshrink_word2vec: Adaptive Shrinkage of cosine similarities in word2vec output.


Description

Performs adaptive shrinkage (ash) of the cosine similarities in word2vec output. The ash framework was proposed by Matthew Stephens (2016).

Usage

corshrink_word2vec(model_true, model_boot_list, word_vec,
  num_related_words = 500)

Arguments

model_true

A word2vec VectorSpace model obtained by fitting word2vec to the original corpus.

model_boot_list

A list of VectorSpace models obtained by fitting word2vec to resampled corpus data.

word_vec

A word set or a vector of words of interest. For each word in this set, the cosine similarities with its primary linked words are shrunk.

num_related_words

The number of primary linked words for each word in word_vec that are taken into the shrinkage model. Defaults to 500.
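
The model_boot_list argument expects models fitted to resampled versions of the corpus. The following is a minimal sketch of generating such resamples in base R. It assumes the corpus is a character vector with one document per element and that documents are resampled with replacement; the documentation does not specify the resampling scheme. The names corpus, n_boot, boot_corpora and fit_word2vec are hypothetical and not part of the package.

```r
# Toy corpus: one document per element (an assumption for illustration).
corpus <- c("the cat sat on the mat",
            "the dog chased the cat",
            "a bird sang in the tree",
            "the dog slept in the sun")

set.seed(1)
n_boot <- 5  # hypothetical number of bootstrap replicates

# Resample documents with replacement, keeping the corpus size fixed.
boot_corpora <- lapply(seq_len(n_boot), function(b) {
  sample(corpus, size = length(corpus), replace = TRUE)
})

# Each resampled corpus would then be passed to whatever routine fits
# word2vec and returns a VectorSpace model, e.g. (hypothetical helper):
# model_boot_list <- lapply(boot_corpora, fit_word2vec)
```

Each element of boot_corpora has the same number of documents as the original corpus, so the fitted models are comparable in scale to model_true.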

Value

Returns a list with one element per word in word_vec, each holding the ash results for that word. Each element is itself a list containing the following components.

similar_words

All the words taken into the shrinkage model for the given word.

cosine_est

Cosine similarities from the original word2vec model.

sd_cosine_transform_est

Standard errors of the Fisher z-scores of the cosine similarities between the given word and its primary linked words, obtained from the resampled corpus data.

ash_out

The ash result for the given word.

ash_cosine_est

The ash-shrunk cosine similarities for the given word.
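
The standard errors in the returned value are described as being on the Fisher z-score scale. As a rough illustration of that transform, and not the package's own implementation, the sketch below applies z = atanh(r) to made-up bootstrap cosine similarities and takes the standard deviation across replicates. The matrix boot_cosine, its values, and its column names are invented for the example.

```r
# Hypothetical bootstrap cosine similarities between one word of interest
# and three related words: rows are bootstrap replicates, columns are words.
boot_cosine <- rbind(
  c(0.62, 0.48, 0.15),
  c(0.58, 0.51, 0.22),
  c(0.65, 0.44, 0.09),
  c(0.60, 0.50, 0.18)
)
colnames(boot_cosine) <- c("king", "queen", "castle")

# Fisher z-transform: z = atanh(r) = 0.5 * log((1 + r) / (1 - r)).
boot_z <- atanh(boot_cosine)

# Standard deviation across replicates, one value per related word.
sd_z <- apply(boot_z, 2, sd)

# The inverse transform maps z-scores back to the cosine scale.
back <- tanh(boot_z)
```

Working on the z scale makes the sampling distribution of a correlation-like quantity closer to normal, which is the usual reason for transforming before shrinkage and mapping back with tanh afterwards.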


kkdey/WEAVER documentation built on May 8, 2019, 9:24 a.m.