prototypical_context: Find most "prototypical" contexts.
In conText: 'a la Carte' on Text (ConText) Embedding Regression

prototypical_context

R Documentation

Description

Returns the N contexts whose embeddings are, on average, most similar to the embeddings of the full set of contexts.
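
To make "most similar on average" concrete, here is a minimal base R sketch of a typicality score on made-up data. The matrix `ctx_emb` and its values are hypothetical, and excluding self-similarity from the average is an assumption based on the wording "all other contexts" in the Value section; this is not the package's implementation.

```r
# Toy context embeddings: 4 contexts x 2 dimensions (made-up values).
ctx_emb <- rbind(
  c1 = c(1.0, 0.0),
  c2 = c(0.9, 0.1),
  c3 = c(0.8, 0.2),
  c4 = c(0.0, 1.0)
)

# Cosine similarity: L2-normalize the rows, then take inner products.
unit <- ctx_emb / sqrt(rowSums(ctx_emb^2))
sim <- unit %*% t(unit)

# Average similarity of each context to all *other* contexts
# (assumption: the diagonal, i.e. self-similarity, is excluded).
diag(sim) <- NA
typicality <- rowMeans(sim, na.rm = TRUE)

# Keep the N highest-scoring contexts.
N <- 3
top <- sort(typicality, decreasing = TRUE)[seq_len(N)]
print(names(top))
```

In this toy data, `c4` points in a different direction from the other three contexts, so it has the lowest average similarity and drops out of the top N.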

Usage

prototypical_context(
  context,
  pre_trained,
  transform = TRUE,
  transform_matrix,
  N = 3,
  norm = "l2"
)

Arguments

`context`	(character) vector of texts: the `context` variable in the output of `get_context`.
`pre_trained`	(numeric) an F x D matrix of pre-trained embeddings, where F = number of features and D = embedding dimensions. rownames(pre_trained) = set of features for which there is a pre-trained embedding.
`transform`	(logical) if TRUE (default), apply the 'a la carte' transformation; if FALSE, output the untransformed averaged embedding.
`transform_matrix`	(numeric) a D x D 'a la carte' transformation matrix, where D = dimensions of the pre-trained embeddings.
`N`	(numeric) number of most "prototypical" contexts to return.
`norm`	(character) how to compute similarity (see ?text2vec::sim2): `"l2"` for cosine similarity, `"none"` for inner product.
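
The following base R sketch illustrates how the argument shapes fit together: averaging F x D pre-trained embeddings over the tokens of one context, applying a D x D transformation, and the difference between the two `norm` options. All values are made up, the whitespace tokenization is a simplification, and the orientation of the matrix product is an assumption; the package's internals may differ.

```r
# Hypothetical pre-trained embeddings: F = 3 features, D = 2 dimensions.
pre_trained <- rbind(
  border = c(1, 0),
  policy = c(0, 1),
  reform = c(1, 1)
)

# Hypothetical D x D transformation matrix (it simply doubles each dimension).
transform_matrix <- 2 * diag(2)

# Split one context on spaces and keep tokens that have an embedding.
tokens <- strsplit("border policy overhaul", " ", fixed = TRUE)[[1]]
tokens <- intersect(tokens, rownames(pre_trained))

# transform = FALSE: untransformed averaged embedding.
avg_emb <- colMeans(pre_trained[tokens, , drop = FALSE])

# transform = TRUE: apply the D x D matrix to the averaged embedding
# (assumption: the matrix multiplies the embedding from the left).
alc_emb <- as.vector(transform_matrix %*% avg_emb)

# norm = "none" corresponds to the inner product;
# norm = "l2" rescales both vectors to unit length, giving cosine similarity.
inner_product <- sum(avg_emb * alc_emb)
cosine <- inner_product / (sqrt(sum(avg_emb^2)) * sqrt(sum(alc_emb^2)))
```

Because `alc_emb` is just a rescaled copy of `avg_emb` here, the cosine similarity is unaffected by the scaling, while the inner product changes with vector length. That is the practical difference between `"l2"` and `"none"`.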

Value

a data.frame with the following columns:

doc_id: (integer) document id.
typicality_score: (numeric) average similarity score to all other contexts.
context: (character) text of the context.

Examples


# find contexts of immigration
context_immigration <- get_context(x = cr_sample_corpus, target = 'immigration',
                                   window = 6, valuetype = "fixed", case_insensitive = TRUE,
                                   hard_cut = FALSE, verbose = FALSE)

# identify top N prototypical contexts and compute typicality score
pt_context <- prototypical_context(context = context_immigration$context,
                                   pre_trained = cr_glove_subset,
                                   transform = TRUE,
                                   transform_matrix = cr_transform,
                                   N = 3, norm = 'l2')

conText documentation built on Feb. 16, 2023, 7:32 p.m.