unnest_sentences_: Split a column of text into sentences


Description

Split a column of text into sentences, returning one row per parsed sentence.

Usage

unnest_sentences_(tbl, output, input, doc_id = NULL,
  output_id = "sent_id", drop = TRUE)

unnest_sentences(tbl, output, input, doc_id = NULL,
  output_id = "sent_id", drop = TRUE)

Arguments

tbl

data frame containing the column of text to be split into sentences

output

name of column to be created to store parsed sentences

input

name of input column of text to be parsed into sentences

doc_id

column of document ids; if not provided, each row is assumed to be a different document

output_id

name of column to be created to store sentence ids

drop

whether the original input column should be dropped from the output

Value

A data.frame of parsed sentences and sentence ids

Examples

# Toy data: one row per document
df <- data.frame(doc_id = 1:3,
                 text = c("Testing the system. Second sentence for you.",
                          "System testing the tidy documents df.",
                          "Documents will be parsed and lexranked."),
                 stringsAsFactors = FALSE)

# Bare (unquoted) column names
unnest_sentences(df, sents, text)
# Column names passed as strings
unnest_sentences_(df, "sents", "text")

## Not run: 
library(magrittr)

df %>% 
  unnest_sentences(sents, text)

## End(Not run)
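
The optional arguments listed under Arguments are not exercised by the examples above. The following is a minimal sketch, not taken from the package documentation, of how output_id and drop might be used with the string-based form. The column name "sentence_number" is an arbitrary choice for illustration, and the sketch assumes that drop = FALSE keeps the original text column alongside the parsed sentences, as the argument description suggests.

```r
library(lexRankr)

# Same toy data as in the examples above
df <- data.frame(doc_id = 1:3,
                 text = c("Testing the system. Second sentence for you.",
                          "System testing the tidy documents df.",
                          "Documents will be parsed and lexranked."),
                 stringsAsFactors = FALSE)

# Column names are passed as strings in the underscore form.
# "sentence_number" is a hypothetical name for the sentence id column;
# drop = FALSE asks for the original "text" column to be kept.
res <- unnest_sentences_(df, "sents", "text",
                         output_id = "sentence_number",
                         drop = FALSE)

# Inspect which columns were returned
names(res)
```

Keeping the input column can be handy when the parsed sentences need to be checked against the source text; the default drop = TRUE gives a narrower result.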

lexRankr documentation built on May 2, 2019, 1:29 p.m.