View source: R/tokens2sequences.R
tokens2sequences | R Documentation |
This function converts a quanteda::tokens() object into a tokens sequence
object, as expected by some functions in the keras package.
tokens2sequences(x, maxsenlen = 100, keepn = NULL)
x | the quanteda::tokens() object to convert |
maxsenlen | the maximum sentence length kept in the output matrix |
keepn | the maximum number of features to keep |
tokens2sequences(): The output matrix has one row for each tokenized
sentence passed to the function and a number of columns determined by
maxsenlen. The matrix contains a numeric code for every unique token kept
(determined by keepn), and the codes are arranged in the same sequence as
the tokens in the original quanteda::tokens() object.
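The shape of the output described above can be illustrated with a small base-R sketch. It is not the package's implementation: `sketch_sequences()` is a hypothetical helper that takes a plain named list of character vectors, and it assumes that codes are assigned by descending token frequency, that tokens outside the `keepn` most frequent are dropped, and that short rows are left-padded with 0. The real function may differ on any of these points.

```r
# Hypothetical helper, not part of quanteda or keras: a base-R sketch only.
sketch_sequences <- function(toks, maxsenlen = 100, keepn = NULL) {
  # Count every token across all documents.
  freq <- table(unlist(toks, use.names = FALSE))
  # Order by descending frequency; ties are broken alphabetically.
  vocab <- names(freq)[order(-as.integer(freq), names(freq))]
  # Assumption: keepn keeps only the keepn most frequent tokens.
  if (!is.null(keepn)) vocab <- head(vocab, keepn)
  rows <- lapply(toks, function(doc) {
    # Map tokens to integer codes; tokens outside the vocabulary are dropped.
    codes <- match(doc, vocab)
    codes <- codes[!is.na(codes)]
    # Truncate long documents to the first maxsenlen codes.
    codes <- head(codes, maxsenlen)
    # Assumption: short documents are left-padded with 0.
    c(rep(0L, maxsenlen - length(codes)), codes)
  })
  mat <- do.call(rbind, rows)
  rownames(mat) <- names(toks)
  mat
}

toks <- list(d1 = c("a", "b", "a"), d2 = c("b", "c", "a", "a", "d"))
m <- sketch_sequences(toks, maxsenlen = 4, keepn = 3)
print(m)
```

Here `a`, `b`, and `c` receive the codes 1, 2, and 3, and `d` is dropped because `keepn = 3`. The result has one row per input document and `maxsenlen` columns, with the token order preserved inside each row.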
is.tokens2sequences(), tokens2sequences_conform()
library("quanteda")
corp <- corpus_subset(data_corpus_inaugural, Year <= 1793)
corptok <- tokens(corp)
print(corp)
seqs <- tokens2sequences(corptok, maxsenlen = 200)
print(seqs)