concord_leipzig: Generate tidyverse-style concordances for the Leipzig Corpora


View source: R/corplingr_concord_leipzig.R

Description

Produces concordances for Leipzig Corpora files and returns them as a tibble.

Usage

concord_leipzig(leipzig_path = NULL, pattern = NULL, case_insensitive = TRUE)

Arguments

leipzig_path

character strings giving either (i) the file names of the Leipzig corpus files, if they are already in the working directory, or (ii) the complete file path to each of the Leipzig corpus files to be processed.

pattern

a regular expression or exact string specifying the target pattern to search for.

case_insensitive

whether the search ignores case (TRUE – the default) or not (FALSE).
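As a rough illustration of option (ii) for leipzig_path, the vector of complete file paths can be collected with base R rather than typed by hand. This is only a sketch: the directory and the file names below are hypothetical placeholders created in a temporary folder so that the snippet runs on its own, and they do not come from a real Leipzig download.

```r
# Hypothetical demo directory standing in for a folder of Leipzig corpus files
corpus_dir <- file.path(tempdir(), "leipzig_demo")
dir.create(corpus_dir, showWarnings = FALSE)

# Create two empty placeholder files (assumed names, for illustration only)
demo_files <- file.path(corpus_dir, c("corpora_1.txt", "corpora_2.txt"))
invisible(file.create(demo_files))

# Collect the complete path of every .txt file in the directory
leipzig_corpus_path <- list.files(corpus_dir, pattern = "\\.txt$",
                                  full.names = TRUE)
print(basename(leipzig_corpus_path))
```

The resulting character vector has the same shape as the hand-written leipzig_corpus_path in the Examples and could be passed as the leipzig_path argument.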

Value

A concordance tibble consisting of (i) the start and end character positions of the pattern in the corpus; (ii) the corpus file names and sentence IDs in which the pattern is found; (iii) the left, node, and right concordance-style view; and (iv) node_sentences, containing the full sentences with the search pattern replaced by the string "nodeword".
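To make the left, node, and right view concrete, the following base-R sketch splits toy sentences around the first match of a pattern. It is not the implementation used by concord_leipzig(): toy_concord() and the sent_id column name are hypothetical, the output is a plain data frame rather than a tibble, and only the first match per sentence is kept.

```r
# Illustrative only: toy_concord() is a hypothetical helper, not part of corplingr
toy_concord <- function(sentences, pattern, case_insensitive = TRUE) {
  # Position of the first match in each sentence (-1 when there is no match)
  m <- regexpr(pattern, sentences, ignore.case = case_insensitive)
  hit <- m != -1
  start <- as.integer(m[hit])
  end <- start + attr(m, "match.length")[hit] - 1L
  s <- sentences[hit]
  data.frame(
    sent_id = which(hit),                  # index of the matching sentence
    start = start,                         # first character of the match
    end = end,                             # last character of the match
    left = substr(s, 1L, start - 1L),      # context before the node
    node = substr(s, start, end),          # the matched string itself
    right = substr(s, end + 1L, nchar(s)), # context after the node
    stringsAsFactors = FALSE
  )
}

sentences <- c("Dia sedang menjalani perawatan.",
               "Tidak ada yang cocok di sini.",
               "Menjalani hidup itu tidak mudah.")
toy <- toy_concord(sentences, "menjalani")
print(toy)
```

With case_insensitive = TRUE, both "menjalani" and the sentence-initial "Menjalani" are matched; calling toy_concord(sentences, "menjalani", case_insensitive = FALSE) keeps only the first sentence.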

Examples

## Not run: 
# load the required packages
library(corplingr)

# 1. Generate concordance of a pattern from multiple corpus files
leipzig_corpus_path <- c("/Your/Path/to/Leipzig/corpora_1.txt",
                         "/Your/Path/to/Leipzig/corpora_2.txt")

concord <- concord_leipzig(leipzig_path = leipzig_corpus_path, pattern = "menjalani")
str(concord)


# 2. Combine with the pipe "%>%" and the rest of the tidyverse suite
library(dplyr)
library(readr)
concord_leipzig(leipzig_corpus_path, "menjalani") %>%
  # retain only the concordance, corpus name, and sentence id
  select(-start, -end) %>%
  # write the result to a tab-delimited file (file name passed positionally)
  write_delim("my_concordance.txt", delim = "\t")

## End(Not run)

gederajeg/corplingr documentation built on Dec. 20, 2021, 9:50 a.m.