data_corpus_moviereviews | R Documentation
A corpus object containing 2,000 movie reviews classified by positive or negative sentiment.
data_corpus_moviereviews
The corpus includes the following document variables:
sentiment: factor indicating whether a review was manually classified as positive (pos) or negative (neg).
Character counting the position in the corpus.
Random number for each review.
For more information, see cat(meta(data_corpus_moviereviews, "readme")).
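The document variables described above can be inspected directly. The following is a minimal sketch, assuming that quanteda is installed and that the corpus is distributed with the quanteda.textmodels package (an assumption; this page does not name the package that ships the data).

```r
# Assumption: the corpus ships with quanteda.textmodels; adjust the
# package argument if your installation provides it elsewhere.
library(quanteda)
data("data_corpus_moviereviews", package = "quanteda.textmodels")

# number of documents in the corpus
ndoc(data_corpus_moviereviews)

# first rows of the document variables
head(docvars(data_corpus_moviereviews))

# levels of the sentiment factor
levels(data_corpus_moviereviews$sentiment)
```

docvars() returns a data frame with one row per review, so the usual data frame tools (head(), summary(), table()) apply to it.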
https://www.cs.cornell.edu/people/pabo/movie-review-data/
Pang, B., Lee, L. (2004) "A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts." Proceedings of the ACL.
# check polarities
table(data_corpus_moviereviews$sentiment)
# segment each review into sentences; in this corpus, each line of a
# review is one sentence, so split on the newline character
data_corpus_moviereviewsents <-
    quanteda::corpus_segment(data_corpus_moviereviews, "\n", extract_pattern = FALSE)
print(data_corpus_moviereviewsents, max_ndoc = 3)