My goals in scraping my twitter feed are:
The plan is to give each type of archive its own Rscript, so that each can be run by a separate cron job and they do not step on each other's toes. In addition, the archive files themselves will be separate, so that we don't have to worry about access control, simultaneous writes, etc.
The Twitter API v1.1 states that an application can make 15 requests every 15 minutes. For archiving my own feed, I don't imagine making requests more than once an hour. Favorites can probably be archived morning and night (every 12 hours), and hashtags can probably also be done once an hour (or less often), depending on the hashtag.
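As a rough check on that schedule, the arithmetic can be sketched in R. The 15-requests-per-15-minutes figure is taken from the text above as given; the job names in `pollEvery` are made up for illustration, and each run is assumed to make a single request.

```r
# Limit as stated in the text: 15 requests per 15-minute window.
windowMinutes <- 15
requestsPerWindow <- 15

# Hypothetical polling intervals in minutes, one per archive job.
# Assumes each run of a job makes exactly one request.
pollEvery <- c(home = 60, favorites = 12 * 60, hashtag = 60)

# Average number of requests each job makes per 15-minute window.
usedPerWindow <- windowMinutes / pollEvery

# Share of the per-window allowance that each job uses on average.
budgetShare <- usedPerWindow / requestsPerWindow
print(round(budgetShare, 4))
```

Even if all three jobs counted against the same allowance, this schedule would stay far below it, so there is room to poll a busy hashtag more often.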
I want to maintain an archive of my own feed from here on out: not just my own tweets, but my actual user stream (the home timeline).
    archiveLoc <- "C:/Users/rmflight/Documents/coding_projects/socialmediaArchive/twitter"
    library(twitteR)
    load("keys/twitteRCredentials.RData")
    registerTwitterOAuth(twitterCredentials)
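    # Locations of the saved tweet objects and the plain-text copy
    homeData <- file.path(archiveLoc, "homeData.RData")
    homeText <- file.path(archiveLoc, "homeText.txt")

    # Load the existing archive (provides currTweets)
    load(homeData)

    # Tweet IDs are too large to be stored exactly as numeric values,
    # so find the newest one by comparing them as strings
    allID <- sapply(currTweets, function(x){x$id})
    lastID <- allID[order(nchar(allID), allID)][length(allID)]

    # Fetch only the tweets newer than the last archived one
    newTweets <- homeTimeline(n=100, sinceID=lastID)
    currTweets <- c(currTweets, newTweets)
    save(currTweets, file=homeData)

    # Append the new tweets to the text file, one per line
    textTweets <- lapply(newTweets, textTweet)
    textTweets <- paste(textTweets, sep="", collapse="\n")
    cat(textTweets, "\n", file=homeText, append=TRUE, sep="")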
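One detail worth care when picking the most recent ID: tweet IDs are long integers with more digits than an R double can hold exactly, so it may be safer to compare them as strings. A minimal sketch, where `latestID` is a hypothetical helper (not part of twitteR) and the IDs are made up:

```r
# Hypothetical helper: return the largest ID from a character vector of
# decimal tweet IDs without converting them to numeric.
latestID <- function(ids) {
  ids <- as.character(ids)
  # A longer digit string is a larger number; strings of equal length
  # sort in the same order as the numbers they represent.
  ord <- order(nchar(ids), ids, method = "radix")
  ids[ord[length(ord)]]
}

# Made-up IDs for illustration only.
ids <- c("345678901234567890", "345678901234567891", "99999999999999999")

# Doubles carry roughly 15 to 16 significant digits, so these two
# 18-digit IDs are expected to collapse to the same numeric value.
print(as.numeric(ids[1]) == as.numeric(ids[2]))

# The string comparison still tells them apart.
print(latestID(ids))
```

The result is already a character string, so it can be passed as `sinceID` without any further conversion.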