View source: R/plot_nns_ratio.R
plot_nns_ratio | R Documentation |
get_nns_ratio()
A way of visualizing the top nearest neighbors of a pair of ALC embeddings that captures how "discriminant" each feature is of each embedding (group).
plot_nns_ratio(x, alpha = 0.01, horizontal = TRUE)
x |
output of get_nns_ratio |
alpha |
(numerical) between 0 and 1. Significance threshold used to identify significant values.
These are denoted by a |
horizontal |
(logical) defines the type of plot. If TRUE, results are plotted on 1 dimension. If FALSE, results are plotted on 2 dimensions, with the second dimension capturing the ranking of cosine similarity ratios. |
a ggplot-class object.
library(ggplot2)
library(quanteda)
library(conText)

# tokenize corpus
toks <- tokens(cr_sample_corpus)

# build a tokenized corpus of contexts surrounding a target term
immig_toks <- tokens_context(x = toks, pattern = "immigration", window = 6L)

# sample 100 instances of the target term, stratifying by party
# (only for example purposes)
set.seed(2022L)
immig_toks <- tokens_sample(immig_toks, size = 100, by = docvars(immig_toks, 'party'))

# we limit candidates to features in our corpus
feats <- featnames(dfm(immig_toks))

# compute ratio
set.seed(2022L)
immig_nns_ratio <- get_nns_ratio(x = immig_toks,
                                 N = 10,
                                 groups = docvars(immig_toks, 'party'),
                                 numerator = "R",
                                 candidates = feats,
                                 pre_trained = cr_glove_subset,
                                 transform = TRUE,
                                 transform_matrix = cr_transform,
                                 bootstrap = FALSE,
                                 num_bootstraps = 100,
                                 permute = FALSE,
                                 num_permutations = 10,
                                 verbose = FALSE)

# plot the ratios on a single dimension
plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = TRUE)
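To build intuition for the quantity being plotted, the following base R sketch computes a cosine similarity ratio by hand. It is only an illustration under stated assumptions: the helper `cosine_sim`, the two group vectors, and the feature vectors are all hypothetical, made-up values, and this is not the package's internal implementation.

```r
# Hypothetical helper: cosine similarity between two numeric vectors.
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Made-up 3-dimensional embeddings standing in for two group embeddings.
group_R <- c(1, 0, 0)  # plays the role of the numerator group
group_D <- c(0, 1, 0)  # plays the role of the denominator group

# Made-up feature embeddings, one row per candidate feature.
features <- rbind(
  illegal = c(0.8, 0.2, 0.1),
  reform  = c(0.2, 0.8, 0.1),
  law     = c(0.5, 0.5, 0.1)
)

# Similarity of every feature to each group embedding.
sim_R <- apply(features, 1, cosine_sim, b = group_R)
sim_D <- apply(features, 1, cosine_sim, b = group_D)

# Ratio of similarities: numerator group over denominator group.
ratio <- sim_R / sim_D
print(sort(ratio, decreasing = TRUE))
```

With these toy values the ratios come out at about 4 for `illegal`, 1 for `law`, and 0.25 for `reform`. A ratio above 1 marks a feature that sits closer to the numerator group, a ratio below 1 one that sits closer to the denominator group, and a ratio near 1 a feature shared by both.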