reuters_dt: REUTERS-21578 dataset


Description

The package includes different representations of an excerpt from the REUTERS-21578 dataset as sample data. The REUTERS corpus is widely used as sample data for text classification tasks (Silva, Ribeiro 2010). The data here is taken from the tm package. See the files in the 'data-raw' folder of the package for how the sample data was prepared.

Usage

reuters_dt

Format

A data.table with two columns and 20 rows.
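
The following is a minimal sketch of how the sample data might be loaded and inspected. It assumes that the bignlp and data.table packages are installed and that the dataset can be loaded by name with data(); the column names are not listed here, so the sketch only inspects the structure rather than referring to specific columns.

```r
# Assumption: the bignlp package is installed and ships reuters_dt as a dataset.
if (requireNamespace("bignlp", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE)) {
  # Load the dataset into the current environment without attaching the package
  data("reuters_dt", package = "bignlp")

  # Check the dimensions (rows, columns) against the Format description
  print(dim(reuters_dt))

  # Show column names and types
  str(reuters_dt)
}
```

Guarding with requireNamespace() keeps the snippet from failing on systems where either package is missing.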

References

Catarina Silva, Bernardete Ribeiro (2010) Inductive Inference for Large Scale Text Classification. Kernel Approaches and Techniques, Springer: Berlin, pp. 129ff.


PolMine/bignlp documentation built on Jan. 29, 2021, 1:14 a.m.