word_pairs
Searches for occurrences of a pair of words in sentences. The two words can be separated by intervening strings (i.e., other words in between).
word_pairs(corpus, word_1 = NULL, word_2 = NULL,
           min_intervening = 0L, max_intervening = 3L)
corpus
: A character vector of sentences.
word_1
: A regular expression for the first word. The regex must enclose the word in word-boundary characters (as with "\\b" in the Examples).
word_2
: A regular expression for the second word. The regex must enclose the word in word-boundary characters (as with "\\b" in the Examples).
min_intervening
: Minimum number of intervening words between the first and the second word. The default is 0L.
max_intervening
: Maximum number of intervening words between the first and the second word. The default is 3L.
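The arguments above can be illustrated with a minimal base-R sketch. It is not the package's implementation: `build_pair_regex()` and `toy_corpus` are hypothetical names invented here, and the sketch assumes that an intervening word is a run of non-space characters followed by whitespace. It shows why the word-boundary characters matter and how a minimum and maximum number of intervening words can be expressed as a regex quantifier.

```r
# Hypothetical helper, NOT part of the wordpairs package: combines two word
# regexes and a range of intervening words into a single pattern.
build_pair_regex <- function(word_1, word_2,
                             min_intervening = 0L, max_intervening = 3L) {
  # Assumption: one intervening word = non-space characters plus whitespace.
  intervening <- sprintf("(?:\\S+\\s+){%d,%d}",
                         min_intervening, max_intervening)
  paste0(word_1, "\\s+", intervening, word_2)
}

# Invented toy sentences, used only for illustration.
toy_corpus <- c(
  "Dia mengirimkan surat itu kepada adiknya.",
  "Mereka menyerahkan laporan yang sangat panjang dan rinci kepada ketua.",
  "Kami menemukan jalan pulang."
)

rx <- build_pair_regex("\\bmen[a-z]{3,}kan\\b", "\\bkepada\\b")

# Which sentences contain the pair within 0 to 3 intervening words?
has_pair <- grepl(rx, toy_corpus, perl = TRUE)

# Extract the span from the first word to the second word.
hits <- regmatches(toy_corpus, regexpr(rx, toy_corpus, perl = TRUE))

# Without word boundaries, "kepada" also matches inside "kepadanya".
loose  <- grepl("kepada", "kepadanya", perl = TRUE)
strict <- grepl("\\bkepada\\b", "kepadanya", perl = TRUE)
```

In this sketch only the first sentence matches: the second has six words between the verb and kepada, which exceeds `max_intervening`, and the third has no kepada at all.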
A list object with the following elements:
pattern
: the extracted pattern spanning from the first word to the second word.
pattern_tagged
: the version of pattern containing tags for the first and the second word.
matches
: the sentence matches containing the word pairs that are tagged for the first and the second word.
Rajeg, Gede Primahadi Wijaya. (2018). wordpairs: An R package to retrieve word pair in sentences of the (Indonesian) Leipzig Corpora.
# co-occurrence of *me-X-kan* transitive verbs with *kepada*
word_1 <- "\\bmen[a-z]{3,}kan\\b"
word_2 <- "\\bkepada\\b"
corpus <- my_leipzig_sample
m <- word_pairs(corpus,
                word_1 = word_1,
                word_2 = word_2,
                min_intervening = 0L,
                max_intervening = 3L)

# inspect a snippet of the results
head(m$pattern)
head(m$pattern_tagged)

# generate a frequency table for the patterns
freq_tb <- table(m$pattern_tagged)

# sort in decreasing order of frequency
head(sort(freq_tb, decreasing = TRUE))
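As a possible follow-up to the frequency table in the example, the sketch below turns a sorted table into a data frame with relative frequencies, using base R only. The vector `toy_patterns` is invented here to stand in for `m$pattern_tagged`; the actual tag format produced by the package may differ.

```r
# Invented values standing in for m$pattern_tagged (assumption: the real
# output is a character vector; its tag format is not reproduced here).
toy_patterns <- c(
  "mengirimkan surat kepada",
  "menyerahkan kepada",
  "mengirimkan surat kepada",
  "menyerahkan kepada",
  "mengirimkan surat kepada",
  "menjelaskan hal itu kepada"
)

# Count each pattern and sort in decreasing order of frequency.
freq_tb <- sort(table(toy_patterns), decreasing = TRUE)

# Convert to a data frame and add the share of each pattern.
freq_df <- as.data.frame(freq_tb, stringsAsFactors = FALSE)
names(freq_df) <- c("pattern", "n")
freq_df$prop <- freq_df$n / sum(freq_df$n)

head(freq_df)
```

A data frame is easier to filter, merge, or export than a table object, which is why the conversion is sketched here.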