ng20: Dataset with news messages and the news groups they belong to

Description

Multilabel dataset from the text domain. The original name of the dataset is 20ng.

Usage

ng20

Format

An mldr object with 19300 instances, 1006 input attributes and 20 labels.

Source

Ken Lang, "Newsweeder: Learning to filter netnews", in Proc. 12th International Conference on Machine Learning, pp. 331-339, 1995

Examples

## Not run: 
toBibtex(ng20)
ng20$measures

## End(Not run)

Example output

Attaching package: ‘mldr.datasets’

The following object is masked from ‘package:stats’:

    density

[1] "@inproceedings{,\n  author = \"Ken Lang\",\n  title = \"Newsweeder: Learning to filter netnews\",\n  booktitle = \"Proc. 12th International Conference on Machine Learning\",\n  pages = \"331--339\",\n  year = \"1995\"\n}"
$num.attributes
[1] 1026

$num.instances
[1] 19300

$num.inputs
[1] 1006

$num.labels
[1] 20

$num.labelsets
[1] 55

$num.single.labelsets
[1] 17

$max.frequency
[1] 997

$cardinality
[1] 1.02886

$density
[1] 0.05144301

$meanIR
[1] 1.007271

$scumble
[1] 1.317755e-07

$scumble.cv
[1] 22.00266

$tcs
[1] 13.9168
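
The measures above are related to one another, and a quick consistency check can make those relationships explicit. The following is a minimal sketch in base R. It assumes the values are copied by hand from the printed output rather than read from the ng20 object, and the variable names (measures, attributes_match, density_from_cardinality, density_match) are illustrative only, not part of the package.

```r
# Values copied from the ng20$measures output above (not recomputed from the data).
measures <- list(
  num.attributes = 1026,
  num.inputs     = 1006,
  num.labels     = 20,
  cardinality    = 1.02886,
  density        = 0.05144301
)

# Total attributes should equal input attributes plus label columns.
attributes_match <- measures$num.attributes ==
  measures$num.inputs + measures$num.labels

# Label density is the label cardinality divided by the number of labels.
density_from_cardinality <- measures$cardinality / measures$num.labels

# Compare with a tolerance, since the printed values are rounded.
density_match <- isTRUE(all.equal(density_from_cardinality,
                                  measures$density,
                                  tolerance = 1e-6))

print(attributes_match)
print(density_match)
```

With the package installed, the same checks could presumably be run against ng20$measures directly instead of the hand-copied list.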

mldr.datasets documentation built on May 2, 2019, 3:43 p.m.