| sentSplit | R Documentation | 
sentSplit - Splits turns of talk into individual sentences (provided 
proper punctuation is used).  This procedure is usually done as part of the 
data read in and cleaning process.
sentCombine - Pastes together continuous sentences that share the same grouping variable.
TOT - Converts the tot column from sentSplit to a
turn of talk index (no sentence sub-index).  Generally, for internal use.
sent_detect - Detect and split sentences on endmark boundaries.
sent_detect_nlp - Detect and split sentences on endmark boundaries 
using openNLP & NLP utilities, matching the old version of
the openNLP package's now-removed sentDetect function.
sentSplit(
  dataframe,
  text.var,
  rm.var = NULL,
  endmarks = c("?", ".", "!", "|"),
  incomplete.sub = TRUE,
  rm.bracket = TRUE,
  stem.col = FALSE,
  text.place = "right",
  verbose = is.global(2),
  ...
)
sentCombine(text.var, grouping.var = NULL, as.list = FALSE)
TOT(tot)
sent_detect(
  text.var,
  endmarks = c("?", ".", "!", "|"),
  incomplete.sub = TRUE,
  rm.bracket = TRUE,
  ...
)
sent_detect_nlp(text.var, ...)
| dataframe | A dataframe that contains the person and text variable. | 
| text.var | The text variable. | 
| rm.var | An optional character vector of 1 or 2 names of the repeated-measures variables (this will restart the "tot" column). | 
| endmarks | A character vector of endmarks to split turns of talk into sentences. | 
| incomplete.sub | logical.  If  | 
| rm.bracket | logical.  If  | 
| stem.col | logical.  If  | 
| text.place | A character string giving the placement location of the text 
column. This must be one of the strings "original", "right" or "left". | 
| verbose | logical.  If  | 
| grouping.var | The grouping variables.  Default NULL. | 
| as.list | logical.  If  | 
| tot | A tot column from a sentSplit output. | 
| ... | Additional options passed to  | 
sentSplit - returns a dataframe with turn of talk broken apart 
into sentences.  Optionally a stemmed version of the text variable may be 
returned as well.
sentCombine - returns a list of vectors with the continuous 
sentences by grouping.var pasted together.
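To illustrate what "continuous sentences pasted together" means, here is a minimal base R sketch. It is not qdap's sentCombine: `combine_runs` is a hypothetical helper, and it returns a data frame rather than a list for easier inspection. It assumes that sentences belong to the same turn when they are adjacent and share a grouping value.

```r
# Hypothetical helper, not qdap's sentCombine(): paste consecutive sentences
# that share the same grouping value back into one turn of talk.
combine_runs <- function(text, group) {
  runs <- rle(as.character(group))
  # One id per uninterrupted run of the same grouping value
  run_id <- rep(seq_along(runs$lengths), runs$lengths)
  combined <- tapply(text, run_id, paste, collapse = " ")
  data.frame(
    group = runs$values,
    text = unname(as.vector(combined)),
    stringsAsFactors = FALSE
  )
}

text <- c("Hi.", "How are you?", "Fine.", "Good.")
person <- c("sam", "sam", "greg", "sam")
combine_runs(text, person)
```

Note that the last "sam" sentence stays separate because "greg" speaks in between; only uninterrupted runs are merged.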
TOT - returns a numeric vector of the turns of talk without 
sentence sub-indexing (e.g. 3.2 becomes 3).
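The conversion can be sketched in base R as follows. This is not qdap's TOT: `tot_to_turn` is a hypothetical helper, and it assumes tot values look like "3.2" (turn, period, sentence number).

```r
# Hypothetical helper, not qdap's TOT(): drop the sentence sub-index.
tot_to_turn <- function(tot) {
  # Work on character values so that "3.10" is not read as the number 3.1
  as.numeric(sub("\\..*$", "", as.character(tot)))
}

tot <- c("1.1", "1.2", "2.1", "3.1", "3.2")
tot_to_turn(tot)
```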
sent_detect - returns a character vector of sentences split on
endmark.
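As a rough illustration of splitting on endmark boundaries, here is a base R sketch. It is not qdap's sent_detect: `split_on_endmarks` is a hypothetical helper, and it assumes every sentence ends in one of the endmarks followed by whitespace. It does not handle abbreviations, brackets, or incomplete sentences.

```r
# Hypothetical helper, not qdap's sent_detect(): split text after each endmark.
split_on_endmarks <- function(text, endmarks = c("?", ".", "!", "|")) {
  # Build a character class such as [?.!|]; these marks are literal inside [ ]
  char_class <- paste0("[", paste(endmarks, collapse = ""), "]")
  # Split at whitespace that directly follows an endmark
  # (the lookbehind requires perl = TRUE)
  pattern <- paste0("(?<=", char_class, ")\\s+")
  pieces <- unlist(strsplit(text, pattern, perl = TRUE))
  trimws(pieces)
}

split_on_endmarks("I am fine. Are you? Great!")
```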
sentSplit requires the dialogue (text) 
column to be cleaned in a particular way.  The data should contain qdap
punctuation marks (c("?", ".", "!", "|")) at the end of each sentence.
Additionally, extraneous punctuation, such as the periods in abbreviations, should be removed
(see replace_abbreviation).
Trailing sentences such as I thought I... will be treated as 
incomplete and marked with "|" to denote an incomplete/trailing 
sentence.
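The marking of trailing sentences can be pictured with a small base R sketch. It is not qdap's incomplete_replace: `mark_incomplete` is a hypothetical helper, and it assumes a trailing sentence ends in two or more periods. The actual function may recognize other patterns as well.

```r
# Hypothetical helper, not qdap's incomplete_replace(): mark trailing sentences.
mark_incomplete <- function(text) {
  # Replace a run of two or more periods at the end of the string
  # (plus any trailing whitespace) with the incomplete endmark "|"
  sub("\\.{2,}\\s*$", "|", text)
}

mark_incomplete(c("I thought I...", "I am fine."))
```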
It is recommended that the user run check_text on the 
text column of sentSplit's output.
Dason Kurkiewicz and Tyler Rinker <tyler.rinker@gmail.com>.
bracketX, 
incomplete_replace,
stem2df,
TOT
## Not run: 
## `sentSplit` EXAMPLE:
(out <- sentSplit(DATA, "state"))
out %&% check_text()  ## check output text
sentSplit(DATA, "state", stem.col = TRUE)
sentSplit(DATA, "state", text.place = "left")
sentSplit(DATA, "state", text.place = "original")
sentSplit(raj, "dialogue")[1:20, ]
## plotting
plot(out)
plot(out, grouping.var = "person")
out2 <- sentSplit(DATA2, "state", rm.var = c("class", "day"))
plot(out2)
plot(out2, grouping.var = "person")
plot(out2, grouping.var = "person", rm.var = "day")
plot(out2, grouping.var = "person", rm.var = c("day", "class"))
## `sentCombine` EXAMPLE:
dat <- sentSplit(DATA, "state") 
sentCombine(dat$state, dat$person)
truncdf(sentCombine(dat$state, dat$sex), 50)
## `TOT` EXAMPLE:
dat <- sentSplit(DATA, "state") 
TOT(dat$tot)
## `sent_detect`
sent_detect(DATA$state)
## NLP based sentence splitting 
sent_detect_nlp(DATA$state)
## End(Not run)