text_token: Tokenizing text


View source: R/ttgsea.R

Description

Text is tokenized into n-grams of configurable size. The function can also limit the total number of tokens that are kept.
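For instance, with ngram_min = 1 and ngram_max = 2, each input string is split into unigrams and bigrams. A minimal sketch with made-up input strings, assuming (per the Value section below) that the return value is a list with components token, ngram_min and ngram_max:

library(ttgsea)
result <- text_token(c("gene set enrichment",
                       "pathway enrichment analysis"),
                     ngram_min = 1, ngram_max = 2,
                     num_tokens = 100)
result$token   # tokenized text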

Usage

text_token(text, ngram_min = 1, ngram_max = 1, num_tokens)

Arguments

text

text data

ngram_min

minimum size of an n-gram (default: 1)

ngram_max

maximum size of an n-gram (default: 1)

num_tokens

maximum number of tokens

Value

token

result of tokenizing text

ngram_min

minimum size of an n-gram

ngram_max

maximum size of an n-gram

Author(s)

Dongmin Jung

See Also

tm::removeWords, stopwords::stopwords, textstem::lemmatize_strings, text2vec::create_vocabulary, text2vec::prune_vocabulary
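Together, these functions suggest the preprocessing steps applied before tokenization: stop-word removal, lemmatization, and n-gram vocabulary construction. Below is a rough sketch of such a pipeline built directly from them; it illustrates the technique only and is not the actual source of text_token (the sample strings are made up):

library(tm)
library(stopwords)
library(textstem)
library(text2vec)

txt <- c("Gene sets were analyzed", "Pathways were analyzed")  # toy corpus
txt <- tolower(txt)
txt <- tm::removeWords(txt, stopwords::stopwords("en"))  # drop common stop words
txt <- textstem::lemmatize_strings(txt)                  # reduce words to their lemmas
it <- text2vec::itoken(txt, tokenizer = text2vec::word_tokenizer)
vocab <- text2vec::create_vocabulary(it, ngram = c(1L, 2L))        # unigrams and bigrams
vocab <- text2vec::prune_vocabulary(vocab, vocab_term_max = 1000)  # keep at most 1000 terms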

Examples

library(ttgsea)
library(fgsea)
data(examplePathways)
data(exampleRanks)
# drop the first 8 characters (the pathway ID prefix) and
# replace underscores with spaces
names(examplePathways) <- gsub("_", " ",
                               substr(names(examplePathways), 9, 1000))
set.seed(1)
fgseaRes <- fgsea(examplePathways, exampleRanks)
# tokenize the pathway names, keeping at most 1000 tokens
tokens <- text_token(data.frame(fgseaRes)[, "pathway"],
                     num_tokens = 1000)
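
Per the Value section, the tokenized pathway names can then be inspected, for example (assuming a list return value):

head(tokens$token)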
