plot_nns_ratio: Plot output of 'get_nns_ratio()'
In prodriguezsosa/conText: 'a la Carte' on Text (ConText) Embedding Regression

plot_nns_ratio

R Documentation

Plot output of `get_nns_ratio()`

Description

Visualizes the top nearest neighbors of a pair of ALC embeddings, showing how strongly each feature discriminates between the two embeddings (groups).

Usage

plot_nns_ratio(x, alpha = 0.01, horizontal = TRUE)

Arguments

`x`	output of `get_nns_ratio()`
`alpha`	(numerical) between 0 and 1. Significance threshold used to identify significant values, which are marked with a `*` on the plot.
`horizontal`	(logical) defines the type of plot. If TRUE, results are plotted on 1 dimension. If FALSE, results are plotted on 2 dimensions, with the second dimension capturing the ranking of cosine similarity ratios.

Value

a ggplot-class object.

Examples


library(conText)
library(ggplot2)
library(quanteda)

# tokenize corpus
toks <- tokens(cr_sample_corpus)

# build a tokenized corpus of contexts surrounding a target term
immig_toks <- tokens_context(x = toks, pattern = "immigration", window = 6L)

# sample 100 instances of the target term, stratifying by party (only for example purposes)
set.seed(2022L)
immig_toks <- tokens_sample(immig_toks, size = 100, by = docvars(immig_toks, 'party'))

# we limit candidates to features in our corpus
feats <- featnames(dfm(immig_toks))

# compute ratio
set.seed(2022L)
immig_nns_ratio <- get_nns_ratio(x = immig_toks,
                                 N = 10,
                                 groups = docvars(immig_toks, 'party'),
                                 numerator = "R",
                                 candidates = feats,
                                 pre_trained = cr_glove_subset,
                                 transform = TRUE,
                                 transform_matrix = cr_transform,
                                 bootstrap = TRUE,
                                 # num_bootstraps should be at least 100
                                 num_bootstraps = 100,
                                 permute = FALSE,
                                 num_permutations = 10,
                                 verbose = FALSE)

plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = TRUE)
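
# The lines below are an illustrative sketch, not part of the original example.
# They reuse `immig_nns_ratio` computed above and only vary the documented arguments.

# 2-dimensional version: the second dimension captures the ranking of the ratios
plot_nns_ratio(x = immig_nns_ratio, alpha = 0.01, horizontal = FALSE)

# the value is a ggplot-class object, so standard ggplot2 layers can be added;
# the looser threshold and the title text here are arbitrary choices for illustration
p <- plot_nns_ratio(x = immig_nns_ratio, alpha = 0.05, horizontal = TRUE)
p + ggtitle("Nearest neighbors of 'immigration' by party")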

prodriguezsosa/conText documentation built on April 23, 2024, 7:04 p.m.