textstat_context: Identify context words
In LSX: Semi-Supervised Algorithm for Document Scaling

textstat_context

R Documentation

Identify context words

Description

Identify context words using user-provided patterns.

Usage

textstat_context(
  x,
  pattern,
  valuetype = c("glob", "regex", "fixed"),
  case_insensitive = TRUE,
  window = 10,
  min_count = 10,
  remove_pattern = TRUE,
  n = 1,
  skip = 0,
  ...
)

char_context(
  x,
  pattern,
  valuetype = c("glob", "regex", "fixed"),
  case_insensitive = TRUE,
  window = 10,
  min_count = 10,
  remove_pattern = TRUE,
  p = 0.001,
  n = 1,
  skip = 0
)

Arguments

`x`	a tokens object created by `quanteda::tokens()`.
`pattern`	a pattern specifying the target words; see `quanteda::pattern()` for the accepted forms.
`valuetype`	the type of pattern matching: `"glob"` for "glob"-style wildcard expressions; `"regex"` for regular expressions; or `"fixed"` for exact matching. See `quanteda::valuetype()` for details.
`case_insensitive`	if `TRUE`, ignore case when matching.
`window`	size of window for collocation analysis.
`min_count`	minimum frequency of words within the window to be considered as collocations.
`remove_pattern`	if `TRUE`, words that match `pattern` are excluded from the returned context words.
`n`	integer vector specifying the number of elements to be concatenated in each n-gram. Each element of this vector will define a `n` in the `n`-gram(s) that are produced.
`skip`	integer vector specifying the adjacency skip size for tokens forming the n-grams; the default, 0, uses only immediately neighbouring words. For skip-grams, `skip` can be a vector of integers, as the "classic" approach to forming skip-grams is to set `skip = k`, where `k` is the distance for which `k` or fewer skips are used to construct the `n`-gram. Thus a "4-skip-n-gram", defined as `skip = 0:4`, produces results that include 4 skips, 3 skips, 2 skips, 1 skip, and 0 skips (where 0 skips are the typical n-grams formed from adjacent words). See Guthrie et al. (2006).
`...`	additional arguments passed to `quanteda.textstats::textstat_keyness()`.
`p`	threshold for statistical significance of collocations.
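
Examples

The following is a minimal sketch of how the two functions might be called, not an example taken from the package itself. It assumes that the quanteda and LSX packages are installed and uses `data_corpus_inaugural`, the corpus bundled with quanteda; the pattern `"america*"` and the threshold `p = 0.05` are arbitrary choices for illustration.

```r
library(quanteda)
library(LSX)

# Tokenize the inaugural speeches bundled with quanteda, dropping
# punctuation and common English function words (illustrative choices).
toks <- tokens(data_corpus_inaugural, remove_punct = TRUE)
toks <- tokens_remove(toks, stopwords("en"))

# Keyness statistics for words that occur within 10 tokens of any
# word matching the glob pattern "america*".
stat <- textstat_context(toks, pattern = "america*",
                         window = 10, min_count = 10)
head(stat)

# The same analysis, but returning only the words whose association
# passes the significance threshold, as a character vector.
ctx <- char_context(toks, pattern = "america*",
                    window = 10, min_count = 10, p = 0.05)
head(ctx, 20)
```

`textstat_context()` is the choice when the statistics themselves are of interest, while `char_context()` is convenient when only the list of context words is needed, for example to pass it on to another function.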
