bt_td_matrix_preprocess: Preprocess a document corpus into fixed-length vectors of...


bt_td_matrix_preprocess    R Documentation

Preprocess a document corpus into fixed-length vectors of integers, returned as a data.frame or matrix. An error may be thrown if you do not have a dedicated NVIDIA GPU; this can be ignored.

Description

Requires Keras to work.

Usage

bt_td_matrix_preprocess(
  num_words = 15000,
  max_length = 100,
  text,
  tokeniser = NULL,
  as.df = T
)

Arguments

num_words

Desired size of vocabulary.
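
To illustrate what capping the vocabulary means, here is a minimal base R sketch. build_vocab is a hypothetical helper, not part of gtabastiat, and the Keras tokeniser that the function relies on may apply different rules (character filters, reserved indices, tie-breaking).

```r
# Hypothetical helper, not part of gtabastiat: build a word index limited to
# the num_words most frequent words in the corpus.
build_vocab <- function(text, num_words) {
  # Lower-case the text and split on anything that is not a letter or digit
  tokens <- unlist(strsplit(tolower(text), "[^a-z0-9]+"))
  tokens <- tokens[nzchar(tokens)]
  # Count each word and order from most to least frequent
  counts <- sort(table(tokens), decreasing = TRUE)
  top <- names(counts)[seq_len(min(num_words, length(counts)))]
  # Map each retained word to an integer index starting at 1
  setNames(seq_along(top), top)
}

corpus <- c("tariff on steel imports", "steel tariff raised", "steel quota")
vocab <- build_vocab(corpus, num_words = 2)
```

Words outside the retained vocabulary have no index in this sketch and would simply be dropped when documents are encoded.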

max_length

Desired length of each doc. Longer docs will be truncated. Shorter docs will be zero-padded.
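
A minimal base R sketch of fixing every document to max_length: sequences longer than max_length are cut, and shorter ones are filled with zeros. fix_length is a hypothetical helper, not part of gtabastiat. Padding and cutting at the end of the sequence is an assumption made here for readability; the function itself may pad or truncate at the start instead.

```r
# Hypothetical helper, not part of gtabastiat: pad or truncate one integer
# sequence so that it has exactly max_length entries.
fix_length <- function(ids, max_length) {
  n <- length(ids)
  if (n >= max_length) {
    ids[seq_len(max_length)]          # keep the first max_length tokens
  } else {
    c(ids, rep(0L, max_length - n))   # append zeros up to max_length
  }
}

# Two already-tokenised documents of different lengths
docs <- list(c(4L, 7L, 1L), c(2L, 9L, 3L, 5L, 8L, 6L))

# One row per document, max_length columns
mat <- t(vapply(docs, fix_length, integer(4), max_length = 4))
```

The first document gains one trailing zero and the second loses its last two tokens, so both rows have four columns.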

text

The document corpus.

as.df

If TRUE, a data.frame is returned. If FALSE, a matrix is returned.

Value

Sparse document TD matrix, returned as a data.frame, or as a matrix if as.df = FALSE.
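
The two return shapes hold the same values. As a rough illustration in base R, with a hand-made integer matrix standing in for the function's output (the values and the default column names V1, V2, ... are assumptions of this sketch, not documented behaviour of the function):

```r
# Stand-in for the matrix form: one row per document, one column per position
mat <- matrix(c(4L, 7L, 1L, 0L,
                2L, 9L, 3L, 5L), nrow = 2, byrow = TRUE)

# The data.frame form carries the same integers, with one column per position
df <- as.data.frame(mat)
```

A matrix is usually the more convenient input for model fitting, while a data.frame is easier to join with document metadata.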

References

www.globaltradealert.org


global-trade-alert/gtabastiat documentation built on June 4, 2023, 6:40 a.m.