get_sentences: Function extracting sentences from raw text
In mmochtak/sentenceR: Language-agnostic sentence tokenizer with UDPipe back end

get_sentences

R Documentation

Function extracting sentences from raw text

This function tokenizes text at the sentence level.

get_sentences(
  text,
  language,
  lem = FALSE,
  remove_no = FALSE,
  remove_punct = FALSE,
  tolower = FALSE,
  verbose = FALSE,
  n_cores = 1
)
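
A call using this signature might look like the sketch below. It assumes that sentenceR is installed from GitHub and that "english" is an accepted value for `language`; neither is confirmed on this page, and the shape of the returned object is not documented here, so the example only inspects it with `str()`.

```r
# Assumption: sentenceR has been installed from GitHub, e.g. with
# remotes::install_github("mmochtak/sentenceR")

# Two short documents to split into sentences
texts <- c(
  "This is the first sentence. Here is a second one!",
  "The meeting starts at nine. It should end before noon."
)

if (requireNamespace("sentenceR", quietly = TRUE)) {
  # "english" is an assumed model name; check the UDPipe model list
  # linked in the argument description for valid values.
  sentences <- sentenceR::get_sentences(
    text = texts,
    language = "english",
    lem = TRUE,     # also request lemmatized sentences
    n_cores = 1     # single core, so verbose = TRUE would also work
  )
  # The return structure is not documented on this page; inspect it.
  str(sentences)
} else {
  message("sentenceR is not installed; skipping the example call.")
}
```
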

`text`	Character vector of texts to be tokenized.
`language`	Language model used for tokenization. See the available language models at https://github.com/bnosac/udpipe.
`lem`	Logical; if TRUE, a lemmatized version of each sentence is also extracted. Default is FALSE.
`remove_no`	Logical; if TRUE, numbers are removed. Default is FALSE.
`remove_punct`	Logical; if TRUE, punctuation is removed. Default is FALSE.
`tolower`	Logical; if TRUE, strings are converted to lower case. Default is FALSE.
`verbose`	Logical; if TRUE, extended information on the processed data is displayed. Works only when processing on a single core. Default is FALSE.
`n_cores`	Numeric; number of cores to use for processing. Default is 1.
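
To make the effect of the three cleaning flags concrete, here is a minimal base-R sketch that approximates them on a single string. The helper `clean_sentence()` is hypothetical and is not part of sentenceR; the package's internal processing may differ (for example, it may act on UDPipe tokens rather than on raw strings).

```r
# Illustration only: base-R approximations of remove_no, remove_punct
# and tolower. clean_sentence() is a hypothetical helper, NOT sentenceR code.
clean_sentence <- function(x,
                           remove_no = FALSE,
                           remove_punct = FALSE,
                           tolower = FALSE) {
  if (remove_no) x <- gsub("[[:digit:]]+", "", x)     # drop runs of digits
  if (remove_punct) x <- gsub("[[:punct:]]+", "", x)  # drop punctuation
  if (tolower) x <- base::tolower(x)                  # convert to lower case
  # Collapse whitespace left behind by the removals
  trimws(gsub("\\s+", " ", x))
}

# With all flags left at FALSE the sentence is returned unchanged
clean_sentence("It costs 25 dollars, really!")

# With all three flags switched on
clean_sentence("It costs 25 dollars, really!",
               remove_no = TRUE, remove_punct = TRUE, tolower = TRUE)
```

With every flag enabled, the second call yields "it costs dollars really": the number and the punctuation are gone and the remaining words are lower-cased.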