View source: R/text_util_fun.R
| text_to_sentences | R Documentation | 
Split text x into sentences.
text_to_sentences splits text x 
(consisting of one or more character strings) 
into a vector of its constituent sentences.
text_to_sentences(
  x,
  sep = " ",
  split_delim = "\\.|\\?|!",
  force_delim = FALSE
)
x | A string of text (required), typically a character vector. | 
sep | A character inserted as separator/delimiter between elements when collapsing multi-element strings of x. Default: sep = " ". | 
split_delim | Sentence delimiters (as regex) used to split the collapsed string of x. Default: split_delim = "\\.|\\?|!". | 
force_delim | Boolean: Enforce splitting at split_delim? Default: force_delim = FALSE. | 
The splits of x occur at the given punctuation marks 
(provided as a regular expression, default: split_delim = "\\.|\\?|!"). 
Leading and trailing spaces are removed before returning 
a vector of the remaining character sequences (i.e., the sentences).
The Boolean argument force_delim distinguishes between 
two splitting modes: 
 If force_delim = FALSE (as per default), 
a standard sentence-splitting pattern is assumed: 
A sentence delimiter in split_delim must be followed by 
one or more blank spaces and a capital letter starting the next sentence. 
Sentence delimiters in split_delim are not removed 
from the output.
 If force_delim = TRUE, 
the function enforces splits at each delimiter in split_delim. 
For instance, any dot (matched by the escaped pattern "\\.") is 
interpreted as a full stop, so that sentences containing dots 
mid-sentence (e.g., in abbreviations) are split into parts. 
Sentence delimiters in split_delim are removed 
from the output.
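The default mode (force_delim = FALSE) can be approximated in base R with a look-around pattern. This is only a minimal sketch, not the package's actual implementation: soft_split_sketch is a hypothetical helper, and its pattern hard-codes the delimiters ., ?, and ! instead of reading them from split_delim.

```r
# Hypothetical helper (NOT part of the package): illustrates the
# "delimiter + blank space(s) + capital letter" rule with base R only.
soft_split_sketch <- function(s) {
  # Split at whitespace that is preceded by ., ?, or ! (look-behind)
  # and followed by a capital letter (look-ahead). Only the whitespace
  # is consumed, so the delimiters stay in the output.
  pattern <- "(?<=[.?!])\\s+(?=[A-Z])"
  strsplit(s, split = pattern, perl = TRUE)[[1]]
}

soft_split_sketch("Dr. who is left intact.")   # no capital after "Dr. ": no split
soft_split_sketch("Dr. Who is problematic.")   # capital after "Dr. ": split
```

Under these assumptions, an abbreviation followed by a lowercase word survives, whereas an abbreviation followed by a capitalized word is mistaken for a sentence boundary.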
Internally, text_to_sentences first uses paste 
to collapse strings (adding sep between elements) and then 
strsplit to split strings at split_delim.
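The collapse-then-split sequence can be sketched with base R as follows. This is an illustration of the two steps named above under the assumption of forced splitting (force_delim = TRUE); force_split_sketch is a hypothetical name, and the package's own code may differ in its details.

```r
# Hypothetical helper (NOT part of the package): paste(), then strsplit().
force_split_sketch <- function(x, sep = " ", split_delim = "\\.|\\?|!") {
  # 1. Collapse all elements of x into one string, inserting sep between them.
  collapsed <- paste(x, collapse = sep)
  # 2. Split the collapsed string at every match of split_delim
  #    (the matched delimiters are dropped by strsplit).
  parts <- strsplit(collapsed, split = split_delim)[[1]]
  # 3. Remove leading/trailing spaces and drop any empty pieces.
  parts <- trimws(parts)
  parts[parts != ""]
}

force_split_sketch(c("A first sentence. Exclamation sentence!", "Any questions?"))
force_split_sketch(c("12", "34", "56"), sep = ".")  # sep itself acts as delimiter
```

The second call shows why sep matters: collapsing with "." turns element boundaries into delimiters, so the elements are recovered as separate pieces.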
Returns a character vector (of sentences).
See also: text_to_words for splitting text into a vector of words; 
text_to_chars for splitting text into a vector of characters; 
count_words for counting the frequency of words; 
strsplit for splitting strings.
Other text objects and functions: 
Umlaut,
capitalize(),
caseflip(),
cclass,
chars_to_text(),
collapse_chars(),
count_chars_words(),
count_chars(),
count_words(),
invert_rules(),
l33t_rul35,
map_text_chars(),
map_text_coord(),
map_text_regex(),
metachar,
read_ascii(),
text_to_chars(),
text_to_words(),
transl33t(),
words_to_text()
x <- c("A first sentence. Exclamation sentence!", 
       "Any questions? But etc. can be tricky. A fourth --- and final --- sentence.")
text_to_sentences(x)
text_to_sentences(x, force_delim = TRUE)
# Changing split delimiters:
text_to_sentences(x, split_delim = "\\.")  # only split at "."
text_to_sentences("Buy apples, berries, and coconuts.")
text_to_sentences("Buy apples, berries; and coconuts.", 
                  split_delim = ",|;|\\.", force_delim = TRUE)
                  
text_to_sentences(c("123. 456? 789! 007 etc."), force_delim = TRUE)
# Split multi-element strings (w/o punctuation):
e3 <- c("12", "34", "56")
text_to_sentences(e3, sep = " ")  # Default: Collapse strings adding 1 space, but: 
text_to_sentences(e3, sep = ".", force_delim = TRUE)  # insert sep and force split.
# Punctuation within sentences:
text_to_sentences("Dr. who is left intact.")
text_to_sentences("Dr. Who is problematic.")