This function takes a data frame of raw tweets and performs basic cleaning and tokenization. It returns the input data frame with a new column, clean_text, containing each tweet after cleaning. For convenience, it also adds a column holding the emojis found in each tweet and a column counting the emojis used in each tweet.
remove.mentions |
TRUE by default, controls whether to remove mentions. Can be set to FALSE to keep them. |
remove.hashtags |
TRUE by default, controls whether to remove hashtags. Can be set to FALSE to keep them. |
remove.urls |
TRUE by default, controls whether to remove URLs. Can be set to FALSE to keep them. |
remove.retweets |
TRUE by default, controls whether to remove retweets. Can be set to FALSE to keep them. |
remove.numbers |
FALSE by default, controls whether to remove numbers. Can be set to TRUE to remove them. |
lowercase |
TRUE by default, controls whether to convert all characters to lowercase. Can be set to FALSE to retain case. |
tweets |
An input data frame of raw tweets, usually the output of search_tweets().
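To illustrate how the arguments above might interact, here is a minimal base R sketch. It is not the package's implementation: the function name `clean_tweets_sketch`, the assumption that the raw text sits in a column named `text`, the "starts with RT @" retweet test, and the regular expressions are all hypothetical. Only the argument names and defaults come from the table above, and the emoji columns are left out entirely.

```r
# Hypothetical sketch, NOT the package's implementation.
# Only the argument names and defaults are taken from the documentation.
clean_tweets_sketch <- function(tweets,
                                remove.mentions = TRUE,
                                remove.hashtags = TRUE,
                                remove.urls = TRUE,
                                remove.retweets = TRUE,
                                remove.numbers = FALSE,
                                lowercase = TRUE) {
  # Assumption: the raw tweet text lives in a column named `text`.
  if (remove.retweets) {
    # Assumption: a retweet is any tweet whose text starts with "RT @".
    tweets <- tweets[!grepl("^RT @", tweets$text), , drop = FALSE]
  }
  txt <- tweets$text
  # URLs go first so that "@" or "#" inside a link is not treated separately.
  if (remove.urls)     txt <- gsub("https?://\\S+", "", txt, perl = TRUE)
  if (remove.mentions) txt <- gsub("@\\w+", "", txt, perl = TRUE)
  if (remove.hashtags) txt <- gsub("#\\w+", "", txt, perl = TRUE)
  if (remove.numbers)  txt <- gsub("[0-9]+", "", txt, perl = TRUE)
  if (lowercase)       txt <- tolower(txt)
  # Collapse the runs of whitespace left behind by the removals.
  tweets$clean_text <- trimws(gsub("\\s+", " ", txt, perl = TRUE))
  tweets
}

# Made-up example data, for illustration only.
raw <- data.frame(
  text = c("Loving the new release! @dev_team #rstats https://example.com",
           "RT @someone: 10 Tips for Cleaning Text"),
  stringsAsFactors = FALSE
)

# Defaults: the retweet row is dropped and the remaining text is cleaned.
cleaned <- clean_tweets_sketch(raw)
cleaned$clean_text
# "loving the new release!"

# Keep retweets but strip numbers instead.
kept <- clean_tweets_sketch(raw, remove.retweets = FALSE, remove.numbers = TRUE)
nrow(kept)
# 2
```

In this sketch, retweets are filtered before any text substitution so that the "RT @" marker is still intact when it is tested. The real function may order or define these steps differently.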