library(knitr) opts_chunk$set(message=FALSE, warning=FALSE, eval=TRUE, echo=TRUE)
epicontacts aims to facilitate manipulation, visualisation and analysis of
epidemiological contact data. Such datasets inherently have network components,
in which nodes are typically cases and reported contacts or exposures are
(directed or undirected) edges. This package provides a convenient data
structure as well as functionality specific to handle these data.
epicontacts provides a convenient structure to store heterogeneous
epidemiological contact network data (i.e. nodes and edges) in a single
epicontacts class must contain two components: a line list and a
Each row of the line list should represent unique observations of cases, and each row of the contact list should represent unique pairs of contacts. Each can include arbitrary features, but both datasets should share an identification scheme.
The example that follows will use the
mers_korea_2015, which is a dataset (in
list format) distributed in the outbreaks package.
What features are in the line list?
What about the contact dataset?
In order to create the
epicontacts object, both the line list and contact data
frames must be passed to
make_epicontacts(). This function accommodates
instances when the respective identifiers are not the first columns of these
data frames (see the "id", "from" and "to" arguments).
also account for contact networks that have a direction (see "directed"
merskor15 <- make_epicontacts(linelist = mers_korea_2015$linelist, contacts = mers_korea_2015$contacts, directed = FALSE) class(merskor15) summary(merskor15)
summary() method above provided counts for the number unique cases in both
the contact and line list. The
get_id() function retrieves similar information
but as vectors of identifiers. This can be parameterized as follows:
What are the first ten IDs in the contacts dataset?
contacts_ids <- get_id(merskor15, "contacts") head(contacts_ids, n = 10)
How many IDs are common to both?
subset() method for
epicontacts objects allows for, among other things,
pruning of networks based on values of node and edge attributes. These values
must be passed as named lists to the respective argument.
subset(merskor15, node_attribute = list("outcome" = "Dead", "sex" = "M"), edge_attribute = list("exposure" = "Emergency room"))
In addition to subsetting by node and edge attributes, networks can be pruned to only include components that are connected to certain nodes. The "id" argument takes a vector of nodes and returns the line list of individuals that "touch" those IDs.
nodes <- c("SK_14","SK_145") subset(merskor15, cluster_id = nodes)
subset() method for
epicontacts objects also accepts cluster size
parameters (see "cs", "cs_min" and "cs_max" arguments).
subset(merskor15, cs = 3) subset(merskor15, cs_min = 10, cs_max = 100)
One of the main features of
epicontacts is its visualisation capabilities. As
a default, the package uses interactive plotting based on the
package^2. This interactivity is particularly useful for visualising large
The above is a generic method based on the
vis_epicontacts() and accepts a
number of arguments to customize the plot appearance and functionality. For a
full list of options use
?vis_epicontacts(). For instance, one can customize
nodes using colors and icons:
plot(merskor15, "place_infect", node_shape = "sex", shapes = c(M = "male", F = "female"))
codeawesome to see available shapes.
for plotting can be
graph3D, in which case a 3-dimensional graph will be used
epicontacts loads the
threejs package to enable 3D visualisation tools with
graph3D(merskor15, node_color = "sex", g_title = "MERS Korea 2014")
To interact with the plot:
get_pairwise() function allows processing of variable(s) in the line list
according to each pair in the contact dataset. For the following example, date
of onset of disease is extracted from the line list in order to compute the
difference between disease date of onset for each pair. The value that is
produced from this comparison represents the serial interval (si).
si <- get_pairwise(merskor15, "dt_onset") summary(si) hist(si, col="grey", border="white", xlab="Days after symptoms", main="MERS Korea 2014\nSerial Interval")
get_pairwise() will interpret the class of the column being used for
comparison, and will adjust its method of comparing the values accordingly. For
numbers and dates (like the si example above), the function will subtract
the values. When applied to columns that are characters or categorical,
get_pairwise() will paste values together. Because the function also allows
for arbitrary processing (see "f" argument), these discrete combinations can be
easily tabulated and analyzed.
head(get_pairwise(merskor15, "sex"), n = 10) get_pairwise(merskor15, "sex", f=table) fisher.test(get_pairwise(merskor15, "sex", f=table))
get_clusters() function can be used for to identify connected components
epicontacts object. Here, we illustrate its use to study contact
patterns in a simulated Ebola outbreak. First, we use it to retrieve
data.frame containing the cluster information:
x <- make_epicontacts(ebola_sim$linelist, ebola_sim$contacts, id = "case_id", to = "case_id", from = "infector", directed = TRUE) x clust <- get_clusters(x, output = "data.frame") class(clust) dim(clust) table(clust$cluster_size) barplot(table(clust$cluster_size), main = "Cluster size distribution", xlab = "Cluster size", ylab = "Frequency")
Let us look at the largest clusters. For this, we add cluster information to the
epicontacts object, and then subset it:
x <- get_clusters(x) x_14 <- subset(x, cs = 14) plot(x_14, "cluster_member")
The degree of a node corresponds to its number of edges or connections to other
get_degree() provides an easy method for calculating this value for
epicontacts networks. A high degree in this context indicates an individual
who was in contact with many others.
nb use of "type" argument depends on whether or not the network is directed.
deg_both <- get_degree(merskor15, "both", only_linelist = TRUE)
Which individuals have the ten most contacts?
head(sort(deg_both, decreasing = TRUE), 10)
What is the mean number of contacts?
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.