clean_tokens: Tokenize words, and remove stopwords from corpus


View source: R/coRPysprofiling.R

Description

Tokenize the words in a corpus and remove stopwords.

Usage

clean_tokens(corpus, ignore = stopwords::stopwords("en"))

Arguments

corpus

character vector representing a corpus

ignore

character vector of stopwords to ignore, optional (default: common English words and punctuation)

Value

character vector of word tokens

Examples

coRPysprofiling::clean_tokens("How many species of animals are there in Russia?")
coRPysprofiling::clean_tokens("How many species of animals are there in Russia?", ignore='!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~')
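For readers who want a rough picture of what tokenizing and stopword removal involve, the following is a minimal base-R sketch. It is not the package's implementation: the function name sketch_clean_tokens, the lowercasing step, the splitting rule, and the tiny default stopword list are all assumptions made for illustration.

```r
# Hypothetical stand-in for clean_tokens(); NOT the package's actual code.
# Assumptions: tokens are lowercased, and any run of characters other than
# letters, digits, or apostrophes separates two tokens.
sketch_clean_tokens <- function(corpus,
                                ignore = c("how", "of", "are", "there", "in")) {
  # Lowercase every document, split on separators, and flatten to one vector.
  tokens <- unlist(strsplit(tolower(corpus), "[^[:alnum:]']+"))
  # Drop empty strings produced by leading separators.
  tokens <- tokens[nzchar(tokens)]
  # Keep only the tokens that are not in the ignore list.
  tokens[!tokens %in% ignore]
}

sketch_clean_tokens("How many species of animals are there in Russia?")
```

With the illustrative stopword list above, the call keeps only the content words of the sentence. The real function may differ, for example in how it handles case or punctuation.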

UBC-MDS/coRPysprofiling-R documentation built on March 30, 2021, 12:02 p.m.