In nicholasjhorton/textclassificationexamples: Example Datasets and Functions for Text Classification MEAs

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(textclassificationexamples)
library(ggplot2)
library(rpart)

This package provides the datasets and helper functions needed to work through basic text classification and modeling examples. The vignette provides some additional introduction and background.

Spam

The Spam example lets students work with email subject lines from a model-eliciting activity (MEA) created by Joan Garfield and colleagues at the University of Minnesota (see https://serc.carleton.edu/sp/library/mea/index.html).

Clickbait

The Clickbait example lets students work through a different classification problem, using headlines from sites known to host clickbait and from legitimate news sites. These headlines are derived from Kaggle (see XX).

The intent is to extract a variety of structural features from each headline, whether it comes from a valid news source or a clickbait article, and use them to build a model or classification scheme that accurately predicts the clickbait variable (TRUE/FALSE).
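As a rough illustration of that workflow, the sketch below computes a few structural features with base R and fits a small classification tree with rpart. It is not the package's own method: the toy headlines, the `title` column name, and the feature names (`n_words`, `starts_with_digit`, `has_question`, `mentions_you`) are all assumptions made up for this example. Only the `clickbait` variable is named in the text above.

```r
library(rpart)

# Toy stand-in data. The `title` column and these headlines are hypothetical;
# substitute the package's data and its actual column names.
headlines <- data.frame(
  title = c(
    "10 Things You Won't Believe About Cats",
    "Senate Passes Budget Bill After Long Debate",
    "Which Disney Princess Are You?",
    "Central Bank Holds Interest Rates Steady",
    "17 Photos That Will Make You Smile",
    "Storm Damages Homes Along The Coast",
    "This Is What Happens When You Skip Breakfast",
    "City Council Approves New Transit Plan"
  ),
  clickbait = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Structural features, stored as integers (0/1 for the indicator features).
headlines$n_words <- lengths(strsplit(headlines$title, "\\s+"))
headlines$starts_with_digit <- as.integer(grepl("^[0-9]", headlines$title))
headlines$has_question <- as.integer(grepl("?", headlines$title, fixed = TRUE))
headlines$mentions_you <- as.integer(
  grepl("\\byou\\b", headlines$title, ignore.case = TRUE, perl = TRUE)
)

# rpart needs a factor response to fit a classification tree.
headlines$label <- factor(headlines$clickbait)

# Very permissive control settings so a tree can grow on only 8 rows;
# these are not sensible defaults for real data.
fit <- rpart(
  label ~ n_words + starts_with_digit + has_question + mentions_you,
  data = headlines,
  method = "class",
  control = rpart.control(minsplit = 2, cp = 0, xval = 0)
)

# Compare predicted and actual labels on the same toy rows.
pred <- predict(fit, headlines, type = "class")
table(predicted = pred, actual = headlines$clickbait)
```

On real data the fit should be evaluated on headlines held out from training, since accuracy on the training rows overstates how well the tree generalizes.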

Three datasets are available: articles, articles_clean, and sample_articles.

articles contains the entire dataset provided by the source (see the documentation for more information). articles_clean is a slightly smaller subset of articles from which headlines with explicit language have been filtered out (imperfectly). sample_articles is a 2000-row sample of articles_clean, made up of 1000 clickbait headlines and 1000 news headlines. This small set can be split into training and testing sets for modeling purposes.

head(sample_articles)
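One possible way to make the training and testing split is a simple random partition in base R. The sketch below uses a stand-in data frame (`articles_demo`, a hypothetical name) with the same shape described above, so that it runs without the package installed. The 80/20 proportion and the seed are arbitrary choices for illustration, not something the package prescribes.

```r
set.seed(2020)  # arbitrary seed, only so the split is reproducible

# Stand-in with the shape described above: 1000 clickbait rows, 1000 news rows.
# With the package loaded, sample_articles would take its place.
articles_demo <- data.frame(
  id = 1:2000,
  clickbait = rep(c(TRUE, FALSE), each = 1000)
)

n <- nrow(articles_demo)

# Draw 80% of the row indices for training; the remaining rows form the test set.
train_idx <- sample(n, size = round(0.8 * n))
train <- articles_demo[train_idx, ]
test <- articles_demo[-train_idx, ]

# Check the sizes and the class balance of the training set.
nrow(train)
nrow(test)
table(train$clickbait)
```

A purely random split will not keep the classes exactly balanced in each part. If an exact 50/50 balance matters, the indices can be sampled separately within each level of clickbait.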

nicholasjhorton/textclassificationexamples documentation built on July 26, 2020, 12:42 p.m.
