restore_paragraphs: Restore paragraphs


View source: R/utils.R

Description

Restore paragraphs in a character vector.

Usage

restore_paragraphs(x, skipRegexCurrent = "^\\s*[•A-Z(]",
  skipRegexPrevious = "[\\.?!)]\\s*$")

Arguments

x

a character vector

skipRegexCurrent

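a regular expression applied to the current line; the default matches lines that begin (after optional whitespace) with a bullet, an uppercase letter, or an opening parenthesis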

skipRegexPrevious

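a regular expression applied to the previous line; the default matches lines that end (before optional trailing whitespace) with a period, question mark, exclamation mark, or closing parenthesis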
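
To see what the two default patterns match, they can be tried directly with `grepl()`. This is only a sketch of the patterns themselves: how `restore_paragraphs` combines them internally is not documented on this page, and the variable names below (`starts`, `ends`) are made up for illustration.

```r
# Default patterns copied from the Usage section.
skipRegexCurrent  <- "^\\s*[•A-Z(]"
skipRegexPrevious <- "[\\.?!)]\\s*$"

# Hypothetical sample lines, chosen to exercise the start-of-line pattern.
starts <- c("Upper case start", "(parenthesised)", "lower case start", "  Indented Upper")
grepl(skipRegexCurrent, starts)
# TRUE TRUE FALSE TRUE

# Hypothetical sample lines, chosen to exercise the end-of-line pattern.
ends <- c("ends with period.", "ends with question?", "ends with paren)  ",
          "ends mid-sen-", "no terminator")
grepl(skipRegexPrevious, ends)
# TRUE TRUE TRUE FALSE FALSE
```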

Details

Reconstructs paragraphs from a character vector whose elements are lines split by line breaks and word wrapping. The heuristic is as follows: if a line ends with a hyphen and the next line starts with a lowercase letter, the hyphen is removed and the two word fragments are joined.
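
The hyphenation rule described above can be sketched in a few lines of R. This is a minimal illustration, not the package's implementation: `dehyphenate_sketch` is a hypothetical helper, it ignores `skipRegexCurrent` and `skipRegexPrevious`, and it handles only the hyphen case.

```r
# Hypothetical helper, NOT part of trickypdf: illustrates only the
# "trailing hyphen + lowercase start" rule from the Details text.
dehyphenate_sketch <- function(x) {
  if (length(x) < 2L) return(x)
  out <- x[1L]
  for (line in x[-1L]) {
    last <- out[length(out)]
    if (grepl("-$", last) && grepl("^[a-z]", line)) {
      # Drop the trailing hyphen and glue the next line on directly.
      out[length(out)] <- paste0(sub("-$", "", last), line)
    } else {
      # Otherwise keep the line as a separate element.
      out <- c(out, line)
    }
  }
  out
}

vec <- c(
  "This is a sample text. We freq-",
  "quently encounter issues with bro-",
  "ken lines."
)
dehyphenate_sketch(vec)
# "This is a sample text. We frequently encounter issues with broken lines."
```

A line that ends with a hyphen but is followed by an uppercase start (for example `c("A-", "B")`) is left untouched by this sketch.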

Examples

vec <- c(
  "This is a sample text. We freq-",
  "quently encounter issues with bro-",
  "ken lines."
  )
restore_paragraphs(vec)

PolMine/trickypdf documentation built on Nov. 20, 2019, 8:01 p.m.