news_subset: Select a subset of the full 20Newsgroups dataset


Description

Selects a subset of the full 20Newsgroups dataset by label.

Label names: comp.graphics, comp.os.ms-windows.misc, comp.sys.ibm.pc.hardware, comp.sys.mac.hardware, comp.windows.x, rec.autos, rec.motorcycles, rec.sport.baseball, rec.sport.hockey, sci.crypt, sci.electronics, sci.med, sci.space, misc.forsale, talk.politics.misc, talk.politics.guns, talk.politics.mideast, talk.religion.misc, alt.atheism, soc.religion.christian

Usage

news_subset(X, filter, binary = TRUE, vocabulary = FALSE)

Arguments

X

The newsgroups data object loaded with data(newsgroups).

filter

Either an integer vector specifying the label numbers, or a character vector specifying the label names. Supports "starts with" partial matching for label names.
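The "starts with" matching described for filter can be illustrated with a small base R sketch. It is not the package's implementation: resolve_filter is a hypothetical helper, and numbering the labels by their position in the Description list is an assumption made only for this illustration.

```r
# Label names as listed in the Description section.
label_names <- c(
  "comp.graphics", "comp.os.ms-windows.misc", "comp.sys.ibm.pc.hardware",
  "comp.sys.mac.hardware", "comp.windows.x", "rec.autos", "rec.motorcycles",
  "rec.sport.baseball", "rec.sport.hockey", "sci.crypt", "sci.electronics",
  "sci.med", "sci.space", "misc.forsale", "talk.politics.misc",
  "talk.politics.guns", "talk.politics.mideast", "talk.religion.misc",
  "alt.atheism", "soc.religion.christian"
)

# Hypothetical helper (not part of the package): turn a filter into
# positions in `labels`. Assumes label numbers follow the listed order.
resolve_filter <- function(filter, labels = label_names) {
  if (is.numeric(filter)) {
    idx <- as.integer(filter)
    stopifnot(all(idx >= 1L), all(idx <= length(labels)))
    return(sort(unique(idx)))
  }
  stopifnot(is.character(filter))
  # For each prefix, collect the positions of labels that start with it.
  hits <- lapply(filter, function(prefix) which(startsWith(labels, prefix)))
  sort(unique(as.integer(unlist(hits))))
}

resolve_filter("rec.sport")                              # positions 8 and 9
label_names[resolve_filter(c("sci", "talk.politics"))]   # four sci.* and three talk.politics.* names
```

Under this reading, a short prefix such as "comp" selects every label beginning with it, while a full name such as "sci.med" selects a single label.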

binary

Logical. If TRUE, set the data values to 1.

vocabulary

Logical. If TRUE, include a word column in the data frame returned as the first list element.

Value

A list of two elements: (1) the remapped data; (2) the distinct labels corresponding to each row.
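
Examples

A possible call pattern, based only on the usage line and argument descriptions on this page. The installed package name camcos2017 is an assumption taken from the repository name, so the calls are guarded and are skipped if no package of that name is available.

```r
# Assumption: the package is installed under the name "camcos2017".
if (requireNamespace("camcos2017", quietly = TRUE)) {
  library(camcos2017)
  data(newsgroups)

  # Select every group whose label name starts with "comp".
  comp <- news_subset(newsgroups, filter = "comp",
                      binary = TRUE, vocabulary = FALSE)
  str(comp, max.level = 1)   # expected to be a list of 2 per the Value section

  # Select by label number instead, and request the word column.
  first_two <- news_subset(newsgroups, filter = c(1L, 2L), vocabulary = TRUE)
  str(first_two, max.level = 1)
}
```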


CAMCOS/camcos2017 documentation built on May 6, 2019, 9:23 a.m.