concord_others: Simple concordance function
In gederajeg/corplingr: Tidy Concordances, Collocates, and Wordlist


View source: R/corplingr_concord_others.R

The function generates a tidy concordance for a search pattern in one or more corpus files. It requires the corpus file(s) to be already loaded into the R session as a character vector with more than one element (lines of text/sentences). Each line need not correspond to exactly one sentence. See Examples below for details.

concord_others(
  corpus_vector = "character vector of text loaded/read into console",
  pattern = "regular expressions",
  to_lower_corpus = TRUE,
  case_insensitive = TRUE,
  context_char = 50
)

`corpus_vector`	the vector of corpus texts.
`pattern`	regular expressions for the search pattern.
`to_lower_corpus`	whether to lowercase the corpus (`TRUE` – the default) first or leave it as is (`FALSE`).
`case_insensitive`	whether to ignore the case for the search `pattern` argument (`TRUE` – the default) or not (`FALSE`).
`context_char`	an integer specifying the number of characters of context to show to the left and right of the node pattern.

A tibble (data frame) of the concordance matches, with the LEFT and RIGHT contexts of each match.
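To make the role of `context_char`, `to_lower_corpus`, and `case_insensitive` more concrete, here is a minimal base-R sketch of a keyword-in-context extraction. It is not the package's implementation: `kwic_sketch` is a hypothetical helper, the `NODE` column name is an assumption, and the way the two case arguments are combined here is only one plausible reading of the argument descriptions above.

```r
# Hypothetical helper, NOT part of corplingr: a base-R sketch of a
# keyword-in-context extraction with character-based context windows.
kwic_sketch <- function(corpus_vector, pattern, context_char = 50,
                        to_lower_corpus = TRUE, case_insensitive = TRUE) {
  # Optionally lowercase the corpus before searching (assumed behaviour)
  if (to_lower_corpus) corpus_vector <- tolower(corpus_vector)

  # Locate every match of the pattern in every element of the vector
  hits <- gregexpr(pattern, corpus_vector,
                   ignore.case = case_insensitive, perl = TRUE)

  rows <- lapply(seq_along(corpus_vector), function(i) {
    start <- as.integer(hits[[i]])
    if (start[1] == -1L) return(NULL)  # no match in this element
    end <- start + attr(hits[[i]], "match.length") - 1L
    txt <- corpus_vector[i]
    data.frame(
      # up to context_char characters before the match
      LEFT = substring(txt, pmax(1L, start - context_char), start - 1L),
      # the matched string itself (column name is an assumption)
      NODE = substring(txt, start, end),
      # up to context_char characters after the match
      RIGHT = substring(txt, end + 1L, pmin(nchar(txt), end + context_char)),
      stringsAsFactors = FALSE
    )
  })

  # Stack the per-element results; elements without a match are dropped
  do.call(rbind, rows)
}

# Toy corpus (invented for illustration only)
toy <- c("Dia memandang laut.",
         "Tidak ada.",
         "Memandang langit, ia memandang bumi.")
kwic_sketch(toy, "\\bmemandang\\b", context_char = 5)
```

In this sketch the context window is simply cut off at the start or end of the element, so a match near a line boundary yields a shorter (possibly empty) context.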

## Not run: 
# Load or read in the corpus data
# "load" approach
my_corpus_data <- "/Your/Path/To/Corpus.RData"
load(my_corpus_data)

# "read" approach
my_corpus_path <- "/Your/Path/To/Corpus.txt"
corp <- readr::read_lines(my_corpus_path)

# Inspect the first two elements.
head(corp, 2)
# [1] "Hari yang panas itu berangsur-angsur menjadi dingin, karena matahari,
#      raja siang itu, akan masuk ke dalam peraduannya, ke balik Gunung Sibualbuali,
#      yang menjadi watas dataran tinggi Sipirok yang bagus itu."
# [2] "Langit di sebelah barat pun merah kuning rupanya, dan sinar matahari
#      yang turun itu nampaklah di atas puncak kayu yang tinggi-tinggi, indah
#      rupanya, sebagai disepuh dengan emas juwita."

# OPTIONAL
# Trim down leading and trailing white space
# with str_trim from the stringr package
corp <- stringr::str_trim(corp)
# collapse runs of two or more white-space characters into a single space
corp <- stringr::str_replace_all(corp, "\\s{2,}", " ")


# get concordance for a pattern
concordance <- concord_others(corpus_vector = corp,
                              pattern = "\\bmemandang\\b",
                              to_lower_corpus = TRUE,
                              case_insensitive = TRUE,
                              context_char = 100)

# check the output
str(concordance)
head(concordance)

# save the output as a tab-separated text file
# it can be opened in spreadsheet software for further annotation
# (recent readr versions use `file`; the older `path` argument is deprecated)
readr::write_delim(concordance,
                   file = "/Users/Primahadi/Desktop/my_concordance.txt",
                   delim = "\t")

## End(Not run)