CNAE2: CNAE dataset on classes 4 and 9


Description

The CNAE2 data set is a subset of the original CNAE-9 data, which comprises 1080 documents of free-text business descriptions of Brazilian companies, categorized into 9 topics.

Specifically, CNAE2 contains only the documents belonging to topics "4" and "9". The data set is already pre-processed and provides the bag-of-words representation of the documents; columns with null counts are removed, leading to a matrix of 240 documents over a vocabulary of cardinality 357. This data set is highly sparse (98

Class labels are stored in cl_CNAE.
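
The pre-processing described above (keeping two topics, dropping all-zero vocabulary columns, and measuring sparsity) can be sketched on a toy count matrix. This is only an illustration under stated assumptions: the objects counts, labels, keep_docs, sub and sparsity are hypothetical names with made-up values, not the real CNAE-9 data, and this is not necessarily how the package authors built CNAE2.

```r
# Toy document-term count matrix standing in for the full data
# (hypothetical values, not the real CNAE-9 counts).
counts <- matrix(
  c(1, 0, 0, 2,
    0, 0, 3, 0,
    0, 0, 0, 1,
    2, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), paste0("term", 1:4))
)
labels <- c(4, 9, 1, 9)  # hypothetical topic label of each document

# Keep only the documents belonging to topics 4 and 9.
keep_docs <- labels %in% c(4, 9)
sub <- counts[keep_docs, , drop = FALSE]

# Drop vocabulary columns whose counts are all zero in the subset.
sub <- sub[, colSums(sub) > 0, drop = FALSE]

# Sparsity: the share of zero entries in the remaining matrix.
sparsity <- mean(sub == 0)
print(dim(sub))
print(sparsity)
```

Applied to the real data, the same kind of computation, mean(CNAE2 == 0), would give the proportion of zero entries in the CNAE2 matrix.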

Usage

data(CNAE2)

Format

A matrix for the bag-of-words representation of the CNAE2 dataset.

Source

Original CNAE9 dataset

Examples

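library(deepMOU)
# data() loads the matrix into the workspace under the name CNAE2;
# its return value is only the data set's name, not the data.
data(CNAE2)
print(head(CNAE2))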

deepMOU documentation built on March 4, 2021, 9:09 a.m.
