In USCCANA/socnet: Web Scraping The Social Networks (SOCNET) Listserv

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

socnet

This R package provides access to the data available in the SOCNET mailing list archive at https://lists.ufl.edu/cgi-bin/wa?A0=SOCNET, which the University of Florida hosts on its Listserv website.

Installation

This package is currently under development and is only available as the bleeding-edge version on GitHub. You can use devtools to install it:

devtools::install_github("USCCANA/socnet")

Example

Before we start, let's load the package.

library(socnet)

Suppose that you want to look at the SOCNET archives, but you don't know where to start. You can use the function socnet_list_archives to get a list of the archives that are available in the Listserv.

# Getting the URLs to the archives per month
archives <- socnet_list_archives(cached = TRUE)
head(archives)

Now that we have the list of archives, we can pick one of them and list the subjects (emails) filed under it with the socnet_list_subjects function.

# What was discussed during Oct 17?: Getting the subjects during that time
subjects <- socnet_list_subjects(archives$url[1], cached = TRUE)

Let's take a look at the output:

str(subjects)
head(subjects[,-1])
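
Once the subjects are in a data frame, base R subsetting is enough to narrow them down. The sketch below uses a small made-up data frame as a stand-in for the real output: only the column names subject, from and url are taken from the calls in this README, and the rows themselves are invented for illustration.

```r
# Toy stand-in for the output of socnet_list_subjects(); the rows are
# invented, only the column names (subject, from, url) follow the README.
toy_subjects <- data.frame(
  subject = c("Call for papers: Sunbelt", "Question about ERGM fit",
              "Re: Question about ERGM fit", "New release of igraph"),
  from    = c("Alice A.", "Bob B.", "Carol C.", "Dave D."),
  url     = paste0("https://example.org/msg", 1:4),
  stringsAsFactors = FALSE
)

# Keep only the messages whose subject mentions "ergm" (case-insensitive)
keep <- grepl("ergm", toy_subjects$subject, ignore.case = TRUE)
ergm_subjects <- toy_subjects[keep, ]

# Drop any leading "Re: " so replies group with the original thread
ergm_subjects$thread <- sub("^(re:[[:space:]]*)+", "", ergm_subjects$subject,
                            ignore.case = TRUE)

ergm_subjects[, c("from", "thread")]
```

The same two steps should carry over to the real subjects object, provided its subject column is a character vector.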

Now we can use the function socnet_parse_subject to get the actual data for a particular subject. Let's try it with the first subject in the list (subjects$subject[1]):

socnet_parse_subject(subjects$url[1])

As you can see, the function returns a list with two elements: a vector of meta information and the actual email.
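
To read more than one message, the same call can be repeated over several URLs. The helper below is a hypothetical sketch and not part of socnet: it takes the parser as an argument (for example socnet_parse_subject), pauses between requests to go easy on the server, and leaves a NULL in place of any message that fails instead of stopping. The names parse_many, pause and dummy_parser, as well as the element names meta and contents in the dummy result, are assumptions made for illustration.

```r
# Hypothetical helper (not part of socnet): apply a parser to several URLs,
# pausing between calls and carrying on when one of them fails.
parse_many <- function(urls, parser, pause = 1) {
  out <- vector("list", length(urls))
  names(out) <- urls
  for (i in seq_along(urls)) {
    res <- tryCatch(
      parser(urls[i]),
      error = function(e) {
        warning("Could not parse ", urls[i], ": ", conditionMessage(e))
        NULL
      }
    )
    # Assigning NULL with [[<- would delete the element, so store successes only
    if (!is.null(res)) out[[i]] <- res
    if (i < length(urls)) Sys.sleep(pause)
  }
  out
}

# Demonstration with a dummy parser so the sketch runs offline
dummy_parser <- function(u) {
  if (grepl("bad", u)) stop("page not found")
  list(meta = c(url = u), contents = "message body")
}
res <- suppressWarnings(
  parse_many(c("msg-1", "bad-msg", "msg-3"), dummy_parser, pause = 0)
)
```

With the package loaded, a call along the lines of parse_many(subjects$url[1:3], socnet_parse_subject) would be the intended use, assuming socnet_parse_subject signals an error when a page cannot be read.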

Most active user (compose side)

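# Tabulate x and print its maxn most frequent values as a table
rankfun <- function(x, colnames, maxn = 100) {
  x <- as.data.frame(table(x))
  x <- x[order(-x$Freq), ]
  dimnames(x) <- list(seq_len(nrow(x)), colnames)
  # head() avoids NA rows when there are fewer than maxn distinct values
  knitr::kable(head(x, maxn), row.names = TRUE)
}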

# Getting the from column and removing weird characters
data("subjects")
from <- subjects$from
from <- iconv(from, to="ASCII//TRANSLIT")

# Removing the <[log in to unmask]> part and any whitespace left behind
from <- trimws(tolower(gsub("[<].+", "", from)))

# Fixing some names...
regexp <- "Th?om(as)?( W)?\\.? Valente"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Thomas W. Valente"

regexp <- "Valdis( Krebs)?"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Valdis Krebs"

regexp <- "Steve Borgatti|Borgatti, Steve"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Steve Borgatti"

regexp <- "Snijders, T\\.A\\.B\\.|Tom A\\.B\\. Snijders|T\\.A\\.B\\.Snijders"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Tom Snijders"

regexp <- "Kathleen( M\\.)? Carley"
from[grepl(regexp, from, ignore.case = TRUE)] <- "Kathleen M. Carley"

# Capitalizing the first letter
# I learned (copied) this from stackoverflow!
# https://stackoverflow.com/questions/6364783/capitalize-the-first-letter-of-both-words-in-a-two-word-string
# from <- gsub("(^|[[:space:]])([[:alpha:]])", "\\1\\U\\2", from, perl=TRUE)
from <- stringr::str_to_title(from)


# Creating the table
rankfun(from, colnames=c("User", "Count"))
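
The cleaning steps above can be tried on a handful of sender strings before running them on the full data. The sketch below uses base R only; the input strings are invented to mimic the style of the from field, and sort(table(...)) stands in for rankfun so that the block does not depend on knitr.

```r
# Invented sender strings in the style of the Listserv "from" field
raw_from <- c(
  "Valdis Krebs <[log in to unmask]>",
  "Valdis <[log in to unmask]>",
  "Steve Borgatti <[log in to unmask]>",
  "Borgatti, Steve <[log in to unmask]>",
  "valdis krebs <[log in to unmask]>"
)

# Strip the masked address and the whitespace around the name
clean <- trimws(gsub("[<].+", "", raw_from))

# Collapse name variants into one canonical spelling
clean[grepl("Valdis( Krebs)?", clean, ignore.case = TRUE)] <- "Valdis Krebs"
clean[grepl("Steve Borgatti|Borgatti, Steve", clean, ignore.case = TRUE)] <- "Steve Borgatti"

# Count messages per sender, most active first
counts <- sort(table(clean), decreasing = TRUE)
counts
```

Checking the regular expressions on a small sample like this makes it easier to spot a pattern that matches too much before it silently merges two different people.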

Latest version of the cache data

readLines("inst/cache/readme.md", warn = FALSE)

USCCANA/socnet documentation built on Aug. 17, 2022, 8:42 a.m.
