knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

Travis build status Coverage status

socnet

This R package is created to access the data available in the SOCNET website https://lists.ufl.edu/cgi-bin/wa?A0=SOCNET, which is hosted by The University of Florida in its Listserv website.

Installation

This package is currently under develoment and is only available by downloading the bleeding edge version. You can use devtools to get it:

devtools::install_github("USCCANA/socnet")

Example

Before starts, let's first load the package.

library(socnet)

Suppose that you want to look at the SOCNET archives, but you don't know from where to start. You can use the function socnet_list_archives to get a list of the archives that are available in the Listserv.

# Getting the URLs to the archives per month
archives <- socnet_list_archives(cached = TRUE)
head(archives)

Now that we have the list of archives, we can access one of them and list what are the subjects (emails) that show under that archive with the socnet_list_subjects function.

# What was discussed during Oct 17?: Getting the subjects during that time
subjects <- socnet_list_subjects(archives$url[1], cached = TRUE)

Let's take a look at the output

str(subjects)
head(subjects[,-1])

Now, we can use the function socnet_parse_subject to actually get the data of a particular subject. Let's try with the subject titled r subjects$subject[1]

socnet_parse_subject(subjects$url[1])

As you can see, the function returned a list with two elements, a vector of meta information, and the actual email.

Most active user (compose side)

rankfun <- function(x, colnames, maxn = 100) {
  x <- as.data.frame(table(x))
  x <- x[order(-x$Freq),]
  dimnames(x) <- list(1:nrow(x), colnames)
  knitr::kable(x[1:maxn,], row.names = TRUE)  
}

# Getting the from column and removing weird characters
data("subjects")
from <- subjects$from
from <- iconv(from, to="ASCII//TRANSLIT")

# Removing <[log in to unmask]> message
from <- tolower(gsub("[<].+", "", from))

# Fixing some names...
regexp <- "Th?om(as)?( W)?\\.? Valente"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Thomas W. Valente"

regexp <- "Valdis( Krebs)?"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Valdis Krebs"

regexp <- "Steve Borgatti|Borgatti, Steve"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Steve Borgatti"

regexp <- "Snijders, T\\.A\\.B\\.|Tom A\\.B\\. Snijders|T\\.A\\.B\\.Snijders"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Tom Snijders"

regexp <- "Kathleen( M\\.)? Carley"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Kathleen M. Carley"

# Capitalizing the first letter
# I learned (copied) this from stackoverflow!
# https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string
# from <- gsub("(^|[[:space:]])([[:alpha:]])", "\\1\\U\\2", from, perl=TRUE)
from <- stringr::str_to_title(from)


# Creating the table
rankfun(from, colnames=c("User", "Count"))

Latest version of the cache data

readLines("inst/cache/readme.md", warn = FALSE)


USCCANA/socnet documentation built on Aug. 17, 2022, 8:42 a.m.