split_sentence_token: Split Sentences & Tokens
In trinker/textshape: Tools for Reshaping Text

split_sentence_token

R Documentation

Split Sentences & Tokens

Description

Split text into sentences, then split each sentence into tokens.

Usage

split_sentence_token(x, ...)

## Default S3 method:
split_sentence_token(x, lower = TRUE, ...)

## S3 method for class 'data.frame'
split_sentence_token(x, text.var = TRUE, lower = TRUE, ...)

Arguments

`x`	A `data.frame` or character vector with sentences.
`lower`	logical. If `TRUE`, the words are converted to lower case.
`text.var`	The name of the text variable. If `TRUE`, `split_sentence_token` tries to detect the column with sentences.
`...`	Ignored.
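
As a rough illustration of the arguments above, the sketch below keeps the original capitalization with `lower = FALSE` and names the text column explicitly rather than relying on detection. It assumes the `DATA` data set that ships with textshape has a text column named `state`; substitute your own column name if it does not. The object names `tokens` and `out` are illustrative only.

```r
library(textshape)

x <- c("Mr. Brown comes! He says hello.", "Just go. Go. GO!")

## Keep the original capitalization of each token
tokens <- split_sentence_token(x, lower = FALSE)
tokens

## Name the text column explicitly instead of letting it be detected
## (assumption: DATA has a column called "state")
data(DATA)
out <- split_sentence_token(DATA, text.var = "state")
out
```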

Value

Returns a list of vectors of sentences or an expanded `data.frame` with sentences split apart.
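
The exact nesting of the returned list is not spelled out here, so it can help to inspect the result before indexing into it. A minimal sketch using base R's `str()` and `class()`; the object names `res_chr` and `res_df` are illustrative only.

```r
library(textshape)

## Character input: the result is a list
res_chr <- split_sentence_token(c("He says hello. I give him coffee.", "Just go."))
str(res_chr)

## data.frame input: the result is an expanded data.frame
data(DATA)
res_df <- split_sentence_token(DATA)
class(res_df)
dim(res_df)
```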

Examples

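library(textshape)

## Two strings with tricky boundaries: abbreviations, initials,
## and sentences that run together without a separating space
(x <- c(paste0(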
    "Mr. Brown comes! He says hello. i give him coffee.  i will ",
    "go at 5 p. m. eastern time.  Or somewhere in between!go there"
),
paste0(
    "Marvin K. Mooney Will You Please Go Now!", "The time has come.",
    "The time has come. The time is now. Just go. Go. GO!",
    "I don't care how."
)))
split_sentence_token(x)

## data.frame input: with the default text.var = TRUE the text column is detected
data(DATA)
split_sentence_token(DATA)

## Not run: 
## Kevin S. Dias' sentence boundary disambiguation test set
data(golden_rules)
library(magrittr)

golden_rules %$%
    split_sentence_token(Text)

## End(Not run)

trinker/textshape documentation built on April 5, 2024, 11:39 a.m.