README.md

disqusR

disqusR is a wrapper around version 3 of the public Disqus API. It allows reading threads from Disqus forums into R.

Among the many companies that use or used Disqus are news sites such as politico.com^x, wired.com, cnn.com^x and welt.de^x (^x marks sites that appear to no longer use Disqus). Many of these have since dropped Disqus; at the time of writing, only wired.com still appears to use it.

We developed disqusR some time ago as a fun project to play around with JSON and word clouds in R. Development ceased once some initial functions were working. Since then the package had been rotting in a non-public git repository of ours.

To use disqusR, you have to register at disqus.com. After registration, create an app in Disqus to obtain a public and a secret key. Both are required to use disqusR. Once pubkey and seckey are set, you can start reading comments. Be aware that Disqus imposes an upper limit on the number of requests you can make.
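To keep the keys out of scripts, they can be read from the environment instead of being typed in. The following is a minimal sketch, not part of disqusR: the helper `read_key` and the variable names `DISQUS_PUBKEY` and `DISQUS_SECKEY` are assumptions made for illustration.

```r
# Hypothetical helper (not part of disqusR): read a key from an
# environment variable and fail early if it is missing or empty.
read_key <- function(name) {
  value <- Sys.getenv(name, unset = NA_character_)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Usage, assuming the variables were set beforehand (e.g. in ~/.Renviron):
# pubkey <- read_key("DISQUS_PUBKEY")
# seckey <- read_key("DISQUS_SECKEY")
```

Failing early with a clear message avoids sending requests with empty keys, which would count against the request limit.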

A few examples are given in the help files.

library(disqusR)

# insert pubkey and seckey (or read them from a file)
pubkey <- "..."
seckey <- "..."

# use the website's link
art <- "http://www.rollingstone.com/music/lists/50-greatest-live-albums-of-all-time-20150429"
arts <- threads("list" , forum="rollingstone", thread=paste0("link:", art))

# fetch 5 comments
postslist <- posts("list", thread=arts$id, limit = 5)

# select up to 100 threads after a certain timestamp
unixtime <- as.numeric(as.POSIXct("2017-05-31", format = "%Y-%m-%d"))
trds <- threads("list", forum="rollingstone", since=unixtime, limit = 100)

str(trds)
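A note on the timestamp: without a `tz` argument, `as.POSIXct` interprets the date in the local time zone, so the resulting Unix time differs between machines. A small sketch that pins the conversion to UTC follows; the helper name `to_unixtime` is an assumption for illustration and not part of disqusR.

```r
# Hypothetical helper: convert a "YYYY-MM-DD" date to Unix time,
# interpreting the date as midnight UTC regardless of the local time zone.
to_unixtime <- function(date) {
  as.numeric(as.POSIXct(date, format = "%Y-%m-%d", tz = "UTC"))
}

unixtime <- to_unixtime("2017-05-31")
unixtime
```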

Since a single request returns at most 100 entries, let's have a look at how to read more than 100 comments of a post. To keep the example short, it uses a limit of 5.

# use the website's link (may work if the link has not changed)
art <- "http://www.rollingstone.com/music/lists/50-greatest-live-albums-of-all-time-20150429"
arts <- threads("list" , forum="rollingstone", thread=paste0("link:", art))

postslist <- posts("list", thread=arts$id, limit = 5)

# ascending
postslist <- posts("list", thread=arts$id, order = "asc", limit = 5)

postslist$author.name

pl1 <- postslist$raw_message


# last timestamp in ISO format
time <- postslist$createdAt[5]


postslist <- posts("list", thread=arts$id, order = "asc", limit = 5,
                   start = time)

# should start with the last author of the previous batch
postslist$author.name

# drop the first entry, it repeats the last entry of pl1
pl2 <- postslist$raw_message[2:5]

p <- c(pl1, pl2)
p
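The manual steps above can be generalised into a loop. The following is a sketch under assumptions, not part of disqusR: `fetch_all` and `fetch_page` are hypothetical names, where `fetch_page` stands in for a call such as `posts("list", thread = arts$id, order = "asc", limit = n, start = s)`. It assumes that each page is a data frame with a `createdAt` column, that `start` is inclusive (as the example above suggests), and that timestamps are unique. A mock data source replaces the API so that the sketch runs on its own.

```r
# Sketch: page through an ascending listing, using the last timestamp of
# each page as the start of the next request.
fetch_all <- function(fetch_page, page_size = 5, max_pages = 10) {
  pages <- list()
  start <- NULL
  for (i in seq_len(max_pages)) {
    page <- fetch_page(start = start, limit = page_size)
    # From the second page on, the first row repeats the previous last row.
    if (!is.null(start) && nrow(page) > 0) page <- page[-1, , drop = FALSE]
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    start <- page$createdAt[nrow(page)]
  }
  do.call(rbind, pages)
}

# Mock data source standing in for the API (illustration only).
demo <- data.frame(
  createdAt = sprintf("2017-05-31T10:00:%02d", 1:12),
  raw_message = paste("comment", 1:12),
  stringsAsFactors = FALSE
)
mock_fetch <- function(start = NULL, limit = 5) {
  rows <- if (is.null(start)) demo else demo[demo$createdAt >= start, , drop = FALSE]
  head(rows, limit)
}

all_posts <- fetch_all(mock_fetch, page_size = 5)
nrow(all_posts)
```

The `max_pages` cap keeps the loop from running into the request limit. If two comments share a timestamp, this approach may duplicate or skip entries, so treat it as a starting point only.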


JanMarvin/disqusR documentation built on May 6, 2019, 10:11 a.m.