freqlist_leipzig_each: Generate word-/regex-specific frequency list from the Leipzig...


View source: R/corplingr_freqlist_leipzig_each.R

Description

Generates a tibble of token counts for one or more target words or regular expressions, computed separately for each supplied Leipzig corpus file.

Usage

freqlist_leipzig_each(
  pattern = NULL,
  leipzig_path = "(full) filepath to a (set of) Leipzig corpus files",
  case_insensitive = TRUE
)

Arguments

pattern

the regular expression(s) or exact pattern(s) for the target word(s) whose frequency you want to count in one or more Leipzig corpus files.

leipzig_path

either (i) the file names of the corpus files, if they are in the working directory, or (ii) the complete file path to each of the Leipzig corpus files.

case_insensitive

logical; whether case differences should be ignored (TRUE – the default) or not (FALSE).
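
To illustrate what ignoring case means for a pattern such as the one used in the Examples, here is a minimal sketch in base R. It is not the package's implementation: the `sentences` vector is hypothetical toy data, and `gregexpr()`/`regmatches()` stand in for whatever matching routine `freqlist_leipzig_each()` uses internally.

```r
# Hypothetical toy sentences standing in for lines of a corpus file.
sentences <- c("Memberi itu baik.", "Dia memberikan buku.", "MEMBERI salam.")
regex <- "\\bmemberi(kan)?\\b"

# Case-sensitive matching: only the all-lowercase form is found.
m_cs <- regmatches(sentences, gregexpr(regex, sentences, perl = TRUE))
unlist(m_cs)

# Case-insensitive matching: capitalised and upper-case forms are found too.
m_ci <- regmatches(
  sentences,
  gregexpr(regex, sentences, perl = TRUE, ignore.case = TRUE)
)
unlist(m_ci)
```

With case differences ignored, all three sentences contribute a match; with case-sensitive matching, only the second one does.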

Value

a tibble with three columns: (i) match, the matched form; (ii) corpus_id, the corpus file in which it was found; and (iii) n, the token count of that match in that corpus.
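
The shape of such a result can be sketched with base R alone. This is only an assumption-laden illustration, not the function's actual code: the `corpora` list is hypothetical in-memory data (real Leipzig files would be read from disk), `count_matches()` is a hypothetical helper, a plain `data.frame` stands in for the tibble, and matches are kept exactly as found rather than normalised for case.

```r
# Hypothetical in-memory "corpora"; the names play the role of corpus_id.
corpora <- list(
  corpus_a = c("Dia memberi buku.", "Mereka memberikan uang."),
  corpus_b = c("Memberi itu baik.")
)
regex <- "\\bmemberi(kan)?\\b"

# Hypothetical helper: count each matched form within one corpus.
count_matches <- function(lines, corpus_id) {
  hits <- unlist(regmatches(
    lines,
    gregexpr(regex, lines, perl = TRUE, ignore.case = TRUE)
  ))
  counts <- table(hits)
  data.frame(
    match = as.character(names(counts)),
    corpus_id = rep(corpus_id, length(counts)),
    n = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

# One set of rows per corpus, stacked into a single table.
freq <- do.call(rbind, Map(count_matches, corpora, names(corpora)))
rownames(freq) <- NULL
freq
```

Each row pairs one matched form with one corpus, so the same form can appear once per corpus in which it occurs.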

Examples

## Not run: 
# prepare the input
# (leipzig_file_path is assumed to be a character vector of paths to Leipzig corpus files)
regex <- "\\bmemberi(kan)?\\b"
corpus.path <- leipzig_file_path[1:2]

# generate the frequency count
freqlist_leipzig_each(
  pattern = regex,
  leipzig_path = corpus.path,
  case_insensitive = TRUE
)

## End(Not run)

gederajeg/corplingr documentation built on Dec. 20, 2021, 9:50 a.m.