Description

Vectorization of words or tokens of text is necessary for machine learning. Vectorized sequences are padded or truncated to a fixed length.
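A minimal sketch of the padding/truncation step, assuming the keras R package is available (keras::pad_sequences is the helper listed under See Also; the toy sequences are invented for illustration):

library(keras)
# two integer sequences of unequal length
seqs <- list(c(5, 1), c(7, 2, 9, 4, 3))
# pad or truncate both to length 3; by default both padding and
# truncation happen at the front ("pre"), so the short sequence
# becomes 0 5 1 and the long one is cut to 9 4 3
pad_sequences(seqs, maxlen = 3)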
Usage

token_vector(text, token, length_seq)
Arguments

text         text data

token        result of tokenization (output of "text_token")

length_seq   length of input sequences
Value

sequences of integers
Author(s)

Dongmin Jung
See Also

tm::removeWords, stopwords::stopwords, textstem::lemmatize_strings, tokenizers::tokenize_ngrams, keras::pad_sequences
Examples

library(reticulate)
if (keras::is_keras_available() & reticulate::py_available()) {
  library(fgsea)
  data(examplePathways)
  data(exampleRanks)
  names(examplePathways) <- gsub("_", " ",
                                 substr(names(examplePathways), 9, 1000))
  set.seed(1)
  fgseaRes <- fgsea(examplePathways, exampleRanks)
  tokens <- text_token(data.frame(fgseaRes)[,"pathway"],
                       num_tokens = 1000)
  sequences <- token_vector("Cell Cycle", tokens, 10)
}
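Continuing the example above, one way to inspect the result (a hedged sketch; the exact values depend on the fitted tokenizer):

# sequences should be an integer matrix with length_seq (= 10) columns,
# one row per input text, zero-padded where the text is shorter
dim(sequences)
sequences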