data_bookReviews: Amazon book reviews data

data_bookReviewsR Documentation

Amazon book reviews data

Description

This is a subset of the data used in the paper, which was assembled by Prettenhofer and Stein (2010). It contains 1000 reviews of books on Amazon, of which 500 were selected from the original training data and 500 from the test data.

The full dataset has been used for a variety of things, including classification using svm. The subset was chosen small enough to keep the computation time low, while still containing the examples in the paper.

Usage

data("data_bookReviews")

Format

A data frame with 1000 observations on the following 2 variables.

review

the review in text format (character)

sentiment

factor indicating the sentiment of the review: negative (1) or positive (2)

Source

Prettenhofer, P., Stein, B. (2010). Cross-language text classification using structural correspondence learning. Proceedings of the 48th annual meeting of the association for computational linguistics, 1118-1127.

Examples

data(data_bookReviews)
# Example review:
data_bookReviews[5, 1]

# The data are used in:
## Not run: 
vignette("Support_vector_machine_examples")

## End(Not run)

classmap documentation built on April 23, 2023, 5:09 p.m.