token_eyes: Tokenization and sentiment analysis.

View source: R/wordly_functions.R

Description

Creates one-word tokens, with options for applying stop-word lists and sentiment lexicons. A wrapper around tokenizers::tokenize_words().
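Because token_eyes() wraps tokenizers::tokenize_words(), its case, punctuation, and numeric options correspond to that function's arguments. A minimal sketch of the underlying call (argument names are from the tokenizers package; the mapping to token_eyes() options is an assumption based on the descriptions below):

```r
library(tokenizers)

# The underlying tokenizer. token_eyes()'s to_lower, to_strip_punct, and
# to_strip_numeric options presumably map onto these arguments.
tokenize_words(
  "The 3 quick brown foxes!",
  lowercase     = TRUE,   # cf. to_lower
  strip_punct   = TRUE,   # cf. to_strip_punct
  strip_numeric = FALSE   # cf. to_strip_numeric
)
```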

Usage

token_eyes(df, text_col_name = NULL, stop_word_src = NULL,
  sentiment_src = NULL, to_lower = TRUE, to_strip_punct = TRUE,
  to_strip_numeric = FALSE, remove_empty_tokens = TRUE,
  show_verbose = FALSE)

Arguments

df

An input data.frame or tibble.

text_col_name

The name of the column containing text to be tokenized.

stop_word_src

The stop-word source: either a character vector of custom stop words, or one of the pre-built lists "snowball", "stopwords-iso", "misc", or "smart".
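A custom stop-word list is passed as a character vector. A short sketch, assuming a data frame shaped like ori_dat from the Examples section:

```r
# Custom stop words supplied as a character vector rather than a
# pre-built list name (ori_dat as constructed in the Examples).
custom_stops <- c("the", "a", "and", "of")
ret_custom <- ori_dat %>%
  token_eyes(text_col_name = "doc_text", stop_word_src = custom_stops)
```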

sentiment_src

The sentiment source, e.g. "nrc".
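If the sentiment lexicons follow tidytext conventions (an assumption; the documentation only names "nrc"), a lexicon maps words to sentiment labels, which are then joined onto the tokens:

```r
# Hypothetical sketch: inspect the "nrc" lexicon via tidytext. Requires
# the tidytext and textdata packages; the lexicon is a word/sentiment tibble.
library(tidytext)
nrc <- get_sentiments("nrc")
head(nrc)
```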

to_lower

Should tokens be forced to lowercase? Defaults to TRUE.

to_strip_punct

Should punctuation be removed prior to tokenization? Defaults to TRUE.

to_strip_numeric

Should numeric values be removed prior to tokenization? Defaults to FALSE.

remove_empty_tokens

Should empty tokens (empty strings) be removed after tokenization? Defaults to TRUE.

show_verbose

Should verbose output be printed during processing? Defaults to FALSE.

Examples

ori_dat <- data.frame(
  doc_main = rep(c("Book_A", "Book_B", "Book_C"), each = 10),
  doc_sub  = rep(c("Chp_1", "Chp_2"), each = 5),
  doc_line = rep(1:10, 3),
  doc_text = stringr::sentences[1:30],
  stringsAsFactors = FALSE
)

# No stop words, no sentiment
ret_nsns <- ori_dat %>% token_eyes("doc_text")

# Stop words applied, no sentiment
ret_ysns <- ori_dat %>%
  token_eyes(text_col_name = "doc_text", stop_word_src = "smart")

# Stop words and sentiment applied
ret_ysys <- ori_dat %>%
  token_eyes(text_col_name = "doc_text", stop_word_src = "smart",
             sentiment_src = "nrc")

tomathon-io/wordly documentation built on June 15, 2020, 12:41 a.m.