data_corpus_LMRD: Large Movie Review Dataset from Maas et al. (2011)

data_corpus_LMRD R Documentation

Large Movie Review Dataset from Maas et al. (2011)

Description

A corpus object for sentiment classification, containing 25,000 highly polar movie reviews for training and 25,000 for testing, from Maas et al. (2011).

Usage

data_corpus_LMRD

Format

The corpus docvars consist of:

docnumber

serial (within set and polarity) document number

rating

user-assigned movie rating on an integer scale from 1 to 10

set

indicates whether the review belongs to the test or the training set

polarity

either neg or pos, indicating whether the movie review was negative or positive. See Maas et al. (2011) for the cut-off values that governed this assignment.
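
The docvars listed above can be inspected and used for subsetting with standard quanteda functions. The following is a minimal sketch, assuming that the quanteda and quanteda.classifiers packages are installed and that the dataset is lazy-loaded as shown under Usage; the object name corp_pos is illustrative only.

```r
# Assumption: quanteda and quanteda.classifiers are installed
library("quanteda")
library("quanteda.classifiers")

# Look at the first few rows of the document variables
head(docvars(data_corpus_LMRD))

# Cross-tabulate set membership against polarity
table(docvars(data_corpus_LMRD, "set"),
      docvars(data_corpus_LMRD, "polarity"))

# Keep only the reviews labelled as positive
# (corp_pos is a hypothetical name chosen for this sketch)
corp_pos <- corpus_subset(data_corpus_LMRD, polarity == "pos")
ndoc(corp_pos)
```

Tabulating set before filtering on it shows the labels actually stored in that docvar, so there is no need to guess them.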

Source

http://ai.stanford.edu/~amaas/data/sentiment/

References

Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). "Learning Word Vectors for Sentiment Analysis". The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011).


quanteda/quanteda.classifiers documentation built on Oct. 20, 2023, 6:53 a.m.