word_pairs: Detect word pairs in text


View source: R/word_pairs.R

Description

word_pairs searches sentences for occurrences of a pair of words. The two words may be adjacent or separated by a limited number of intervening words.

Usage

word_pairs(corpus, word_1 = NULL, word_2 = NULL,
  min_intervening = 0L, max_intervening = 3L)

Arguments

corpus

A character vector of sentences.

word_1

A regular expression for the first word. The regex must enclose the word in word-boundary anchors (i.e. "\\b").

word_2

A regular expression for the second word. The regex must enclose the word in word-boundary anchors (i.e. "\\b").

min_intervening

Minimum number of intervening words between word_1 and word_2. The default is 0L.

max_intervening

Maximum number of intervening words between word_1 and word_2. The default is 3L. Use Inf to allow any number of intervening words after word_1 and before the occurrence of word_2.
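
To make the interplay of these arguments concrete, here is a minimal base-R sketch of how two word regexes and an intervening-word window could be combined into a single pattern. It is not the package's implementation: build_pair_regex is a hypothetical helper, the toy sentences are invented for illustration, and the sketch assumes that words are separated by whitespace only.

```r
# Hypothetical helper, NOT part of wordpairs: combines two word regexes
# with a window of min..max intervening whitespace-delimited tokens.
build_pair_regex <- function(word_1, word_2,
                             min_intervening = 0L, max_intervening = 3L) {
  # An infinite maximum becomes an open-ended quantifier such as {0,}
  upper <- if (is.infinite(max_intervening)) "" else as.integer(max_intervening)
  gap <- sprintf("(?:\\s+\\S+){%d,%s}", as.integer(min_intervening), upper)
  paste0(word_1, gap, "\\s+", word_2)
}

# Toy sentences (assumed data, for illustration only)
sentences <- c("dia memberikan buku itu kepada saya",
               "mereka menyerahkan kepada polisi",
               "kami menjelaskan a b c d e kepada mereka")

word_1 <- "\\bmen[a-z]{3,}kan\\b"
word_2 <- "\\bkepada\\b"

# perl = TRUE so that (?:...), \\s, \\S and \\b follow PCRE semantics
hits_default   <- grepl(build_pair_regex(word_1, word_2), sentences, perl = TRUE)
hits_unbounded <- grepl(build_pair_regex(word_1, word_2, max_intervening = Inf),
                        sentences, perl = TRUE)
```

In this sketch the first sentence never matches because "memberikan" does not fit word_1, while the third sentence matches only once max_intervening is raised above its five intervening tokens.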

Value

A list object. Its elements include pattern and pattern_tagged, both of which are used in the Examples.

References

Rajeg, Gede Primahadi Wijaya. (2018). wordpairs: An R package to retrieve word pair in sentences of the (Indonesian) Leipzig Corpora.

Examples

# co-occurrence of *me-X-kan* transitive verbs with *kepada*
word_1 <- "\\bmen[a-z]{3,}kan\\b"
word_2 <- "\\bkepada\\b"
corpus <- my_leipzig_sample
m <- word_pairs(corpus,
                word_1 = word_1,
                word_2 = word_2,
                min_intervening = 0L,
                max_intervening = 3L)

# inspect the first few results
head(m$pattern)
head(m$pattern_tagged)

# generate frequency table for the patterns
freq_tb <- table(m$pattern_tagged)

# sort in decreasing order of frequency
head(sort(freq_tb, decreasing = TRUE))
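
The tabulation step above can also be tried without the package data. The following sketch uses an invented character vector, toy_patterns, standing in for m$pattern_tagged; the actual format of that element is not shown here, so the strings are assumptions.

```r
# Invented stand-in for m$pattern_tagged (the real format may differ)
toy_patterns <- c("menyerahkan kepada",
                  "menjelaskan kepada",
                  "menyerahkan kepada")

# Count each distinct pattern, then sort from most to least frequent
toy_freq <- sort(table(toy_patterns), decreasing = TRUE)

# Name of the most frequent pattern
top_pattern <- names(toy_freq)[1]
```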

gederajeg/wordpairs documentation built on May 23, 2019, 2:46 p.m.