tokens2sequences: [Experimental] Convert quanteda tokens to keras sequences

View source: R/tokens2sequences.R

tokens2sequences    R Documentation

Description

This function converts a quanteda::tokens() object into a token sequence object of the form expected by some functions in the keras package.

Usage

tokens2sequences(x, maxsenlen = 100, keepn = NULL)

Arguments

x

quanteda::tokens() object

maxsenlen

the maximum sentence length kept in the output matrix; this determines the number of columns

keepn

the maximum number of features to keep

Value

tokens2sequences() returns a matrix with one row for each tokenized sentence passed to the function and a number of columns determined by maxsenlen. The matrix contains a numeric code for every unique token kept (as determined by keepn), and the codes are arranged in the same order as the tokens in the original quanteda::tokens() object.
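
The shape of this output can be illustrated with a small base-R sketch. It is not the package's implementation: the helper name sketch_sequences() is hypothetical, and several details are assumptions made only for illustration (codes assigned by frequency rank, short rows left-padded with 0, tokens outside the keepn most frequent dropped before truncation). It shows only the general idea of turning lists of tokens into a fixed-width integer matrix.

```r
# Illustrative sketch only; NOT the tokens2sequences() implementation.
# sketch_sequences() is a hypothetical helper defined here for demonstration.
sketch_sequences <- function(toks, maxsenlen = 100, keepn = NULL) {
  # Rank unique tokens by frequency, most frequent first (assumed coding scheme).
  freq <- sort(table(unlist(toks)), decreasing = TRUE)
  vocab <- names(freq)
  if (!is.null(keepn)) vocab <- head(vocab, keepn)

  rows <- lapply(toks, function(doc) {
    codes <- match(doc, vocab)        # integer code per token, NA if not kept
    codes <- codes[!is.na(codes)]     # drop tokens outside the kept vocabulary
    codes <- head(codes, maxsenlen)   # truncate long inputs to maxsenlen
    # Left-pad short inputs with 0 (assumed padding convention).
    c(rep(0L, maxsenlen - length(codes)), codes)
  })
  do.call(rbind, rows)                # one row per input, maxsenlen columns
}

# Toy input: a named list of character vectors standing in for a tokens object.
toks <- list(d1 = c("a", "b"),
             d2 = c("b", "c", "a", "a", "d", "a"))
m <- sketch_sequences(toks, maxsenlen = 3, keepn = 2)
print(m)
```

In this toy input, "a" and "b" are the two most frequent tokens, so with keepn = 2 only they receive codes; the resulting matrix has one row per input and exactly maxsenlen columns regardless of the original lengths.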

See Also

is.tokens2sequences(), tokens2sequences_conform()

Examples

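library("quanteda")
# tokens2sequences() is provided by the quanteda.classifiers package
library("quanteda.classifiers")
# keep only the inaugural addresses delivered up to 1793
corp <- corpus_subset(data_corpus_inaugural, Year <= 1793)
# tokenize the corpus
corptok <- tokens(corp)
print(corp)
# convert the tokens to sequences of at most 200 tokens each
seqs <- tokens2sequences(corptok, maxsenlen = 200)
print(seqs)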

quanteda/quanteda.classifiers documentation built on Oct. 20, 2023, 6:53 a.m.