clean_tweets: Cleans tweet data frames

View source: R/parseTweetFiles.R

clean_tweets R Documentation

Cleans tweet data frames

Description

Performs all necessary cleaning on a data frame of tweets: removes all symbols from the tweet text, converts it to lower case, removes all stop words, and converts timestamps to an R-usable format. Can also filter by time zone if desired (by default, no filter is applied).
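The cleaning steps described above can be pictured with a minimal base R sketch. This is not the package's implementation: `sketch_clean_text` is a hypothetical helper, the regular expression is one possible definition of "symbols", and the timestamp format shown (the classic Twitter `created_at` layout) is an assumption about the input data.

```r
# Hypothetical helper, not part of the package: illustrates the text steps only.
sketch_clean_text <- function(text, stoplist = character(0)) {
  text <- tolower(text)                      # convert to lower case
  text <- gsub("[^a-z0-9 ]", " ", text)      # replace every symbol with a space
  words <- strsplit(text, " +")              # split on runs of spaces
  vapply(words, function(w) {
    w <- w[nzchar(w) & !(w %in% stoplist)]   # drop empty tokens and stop words
    paste(w, collapse = " ")
  }, character(1))
}

cleaned <- sketch_clean_text(
  c("Hello, World! This is #R", "The CAT & the dog"),
  stoplist = c("the", "is", "this")
)

# Assumed timestamp layout; parsing day and month names depends on an English locale.
created <- "Wed Aug 27 13:08:45 +0000 2008"
parsed <- as.POSIXct(created, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")
```

Replacing symbols with spaces rather than deleting them keeps words such as "rock&roll" from fusing into a single token; whether clean_tweets makes the same choice is not stated in this page.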

Usage

clean_tweets(tweets.df, tz = NULL, stoplist = NULL)

Arguments

tweets.df

A data frame of tweets with the desired variables attached. (Use dplyr to select variables.)

tz

A character vector of time zones to filter by (currently case sensitive). The default, NULL, applies no filter.

stoplist

The stoplist used to filter words
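Because the tz argument is case sensitive, a time zone must be spelled exactly as it appears in the data. The sketch below illustrates what that implies, using a made-up data frame and plain `%in%` matching; whether clean_tweets matches time zones this way internally is an assumption.

```r
# Toy data, invented for illustration; the column names follow the Examples section.
tweets <- data.frame(
  text = c("a", "b", "c"),
  time_zone = c("Pacific Time (US & Canada)",
                "pacific time (us & canada)",
                "London"),
  stringsAsFactors = FALSE
)

tz <- c("Pacific Time (US & Canada)", "Eastern Time (US & Canada)")

# Exact, case-sensitive matching: the lower-case spelling is not kept.
kept <- tweets[tweets$time_zone %in% tz, , drop = FALSE]
```

Here only the first row survives the filter, so it can be worth checking `unique(tweets$time_zone)` before choosing the values passed to tz.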

Value

The tweet data frame with all editing / filtering done. Empty dataset

Examples

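## Not run: 
library(dplyr)  # provides select()
# Keep only the variables needed for cleaning
df = select(rawdata, text, time_zone)
# Clean with the defaults (no time zone filter)
tweets = clean_tweets(df)
# Clean, keeping only two time zones and applying a stop list
tweets = clean_tweets(df, tz = c("Pacific Time (US & Canada)", "Eastern Time (US & Canada)"),
                      stoplist = stoplist)
## End(Not run)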

rturn/parseTweetFiles documentation built on July 31, 2023, 3:43 p.m.