This vignette introduces the netjack package and gives an overview of common data input and analysis pipelines. For a tutorial about creating custom network functions and network statistics, see the "Custom Functions in
Samples of registered networks, or networks that consist of the same node set, are increasingly common in a variety of scientific fields. The
netjack package implements a framework that lets researchers quickly manipulate and analyze large samples of registered networks, as well as develop custom functionality that builds on the existing
In this vignette, we go over the following procedures:
netjack is built around a series of S4 classes that represent different levels of network manipulation, and functions that act on each level of network manipulation. This section describes these classes and functions at a summary level.
The most basic data object is the
Net object. This represents a single network, along with node level variables, such as partition assignments.
A NetSample object represents a collection of
Net objects, along with network level variables. For example, if each network in the sample represents a single individual's functional brain network, a network level variable could be the diagnostic status of each individual.
Net and NetSample objects are representations of raw data. To work with these data objects, the
net_apply function can be used to apply a network manipulation function. A function of this class takes a single network, performs a series of manipulations on it, and returns the manipulated networks. As an example, the
node_jackknife() function applied to a
Net object returns a set of
Net objects corresponding to the original network with each node removed in turn.
To represent the output of net_apply, the S4 classes NetSet and NetSampleSet are used. These classes represent both the original Net or NetSample and the product of the network manipulation.
A common procedure in network analysis is the calculation of various network statistics. In netjack, network statistics can be calculated quickly via the net_stat_apply function for both NetSet and NetSampleSet objects. The output of net_stat_apply is a NetStatSet or NetSampleStatSet object. netjack implements several statistical testing procedures for these objects, which are described below. Additionally, to_data_frame() can be used to extract a data.frame of the calculated network statistics from either class.
To illustrate the various features of
netjack, two simulated datasets are provided, GroupA and GroupB. Networks can be loaded into the
netjack framework from adjacency matrices, either as single Net objects, or more commonly as one NetSample object.
library(netjack)
data("GroupA")
# Build a Net object from a single adjacency matrix in the list
Subject1 <- as_Net(GroupA[[1]], "Subject1")
show(Subject1)
Node variables can be assigned during construction as a named list:
Subject2 <- as_Net(GroupA[[2]], "Subject2",
                   node.variables = list(community = c(rep(1, 10), rep(2, 10))))
show(Subject2)
Typically, a researcher using
netjack is analyzing a sample of registered networks rather than a single network.
NetSample objects can be constructed in much the same way as a
Net object can, using lists of adjacency matrices rather than a single matrix:
GroupASamp = as_NetSample(GroupA, net.names = as.character(1:20),
                          node.variables = list(community = c(rep(1, 10), rep(2, 10))),
                          sample.variables = list(group = rep(1, 20)))
show(GroupASamp)
Importantly, when a NetSample object is created, the list of node variables is applied to every network. This is appropriate in registered network applications, where for example, in neuroimaging networks, each node represents a specific brain region, and each node is the same for each subject.
Sample variables represent network level characteristics. For example, if each network represents a functional connectivity network from a neuroimaging study, a sample variable might be the diagnostic status of a particular individual.
Once a sample of networks is represented as a
NetSample object, a network manipulation function can be applied. As described previously, these functions change a network in some way. As an example, the
node_jackknife function returns a set of networks, where each node has been removed in turn.
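To make the idea of a node jackknife concrete, here is a minimal base R sketch that works directly on an adjacency matrix. The helper node_jackknife_sketch is hypothetical and is not part of netjack. It assumes that removing a node means dropping its row and column, which may differ from how netjack represents removed nodes internally.

```r
# Hypothetical helper (not a netjack function): leave-one-node-out copies
# of a square adjacency matrix.
node_jackknife_sketch <- function(adj) {
  stopifnot(is.matrix(adj), nrow(adj) == ncol(adj))
  n <- nrow(adj)
  # Element i of the result is the network with node i removed.
  lapply(seq_len(n), function(i) adj[-i, -i, drop = FALSE])
}

# Toy input: an undirected 4-node ring.
adj <- matrix(0, 4, 4)
adj[cbind(1:4, c(2, 3, 4, 1))] <- 1
adj <- adj + t(adj)

jackknifed <- node_jackknife_sketch(adj)
length(jackknifed)    # one manipulated network per removed node
dim(jackknifed[[1]])  # each manipulated network has one fewer node
```

The result is a plain list of matrices, loosely analogous to the set of networks that a jackknife manipulation yields for one original network.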
Network manipulation functions can be applied via net_apply to a Net object to produce a NetSet, or to a NetSample object to produce a NetSampleSet:
Sub1Jackknifed <- net_apply(network = Subject1, net.function = "node_jackknife")
show(Sub1Jackknifed)
GroupAJackknifed <- net_apply(network = GroupASamp, net.function = "node_jackknife")
show(GroupAJackknifed)
Network manipulation functions that involve node level variables can be used by including them in the
net.function.args argument within
net_apply. For example,
network_jackknife removes sub-networks on the basis of a node level grouping variable.
GroupANetJackknifed <- net_apply(GroupASamp, net.function = "network_jackknife",
                                 net.function.args = list(network.variable = "community"))
show(GroupANetJackknifed)
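As a rough illustration of what removing sub-networks by a grouping variable involves, the base R sketch below drops all nodes of one group at a time. The helper network_jackknife_sketch and the random test matrix are assumptions made for illustration. This is not netjack's implementation, and netjack may handle the removed nodes differently.

```r
# Hypothetical helper (not a netjack function): remove every node belonging
# to one group at a time, returning one reduced network per group label.
network_jackknife_sketch <- function(adj, groups) {
  stopifnot(nrow(adj) == ncol(adj), length(groups) == nrow(adj))
  labels <- unique(groups)
  out <- lapply(labels, function(g) {
    keep <- groups != g
    adj[keep, keep, drop = FALSE]
  })
  names(out) <- as.character(labels)
  out
}

# Toy input: a random undirected 20-node network with two communities.
set.seed(1)
adj <- matrix(rbinom(400, 1, 0.3), 20, 20)
adj[lower.tri(adj)] <- t(adj)[lower.tri(adj)]  # mirror upper triangle
diag(adj) <- 0
community <- c(rep(1, 10), rep(2, 10))

reduced <- network_jackknife_sketch(adj, community)
names(reduced)         # one entry per removed community
dim(reduced[["1"]])    # nodes left after removing community 1
```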
Once a network manipulation function has been applied, network statistics can be computed.
A network statistic is a single numerical summary of some aspect of a network's structure or topology.
netjack focuses on the analysis of networks at a network statistic level, and provides simple interfaces for calculating network statistics on collections of networks.
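For a sense of what a single network statistic looks like, the sketch below computes global efficiency, the average inverse shortest path length over all pairs of distinct nodes, in base R. The function global_efficiency_sketch is hypothetical and assumes an unweighted, undirected adjacency matrix. The global_efficiency function shipped with netjack may treat weights or disconnected nodes differently.

```r
# Hypothetical helper (not netjack's global_efficiency): global efficiency
# of an unweighted network given as an adjacency matrix.
global_efficiency_sketch <- function(adj) {
  n <- nrow(adj)
  # Start with distance 1 for edges, Inf for non-edges, 0 on the diagonal.
  d <- ifelse(adj != 0, 1, Inf)
  diag(d) <- 0
  # Floyd-Warshall: allow paths through each intermediate node k in turn.
  for (k in seq_len(n)) {
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  inv <- 1 / d      # disconnected pairs contribute 1 / Inf = 0
  diag(inv) <- 0    # self-pairs are excluded
  sum(inv) / (n * (n - 1))
}

# A 3-node path (1 - 2 - 3): two pairs at distance 1, one pair at distance 2.
path3 <- matrix(c(0, 1, 0,
                  1, 0, 1,
                  0, 1, 0), nrow = 3, byrow = TRUE)
global_efficiency_sketch(path3)  # (1 + 1 + 1/2) / 3 = 5/6
```

Whatever the exact definition, the key property is that the function maps one network to one number, which is what allows it to be applied uniformly across a collection of networks.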
Similar in structure to the network manipulation functions, network statistic functions are applied via a
net_stat_apply function, which can be used with either a
NetSet object, or a
NetSampleSet object. This produces a NetStatSet or a NetSampleStatSet, respectively.
Sub1JackknifedGlobEff <- net_stat_apply(Sub1Jackknifed, net.stat.fun = global_efficiency)
show(Sub1JackknifedGlobEff)
GroupAJackknifedGlobEff <- net_stat_apply(GroupAJackknifed, net.stat.fun = global_efficiency)
show(GroupAJackknifedGlobEff)
Once a NetStatSet or NetSampleStatSet has been computed, the network statistics can be extracted into a data.frame with the to_data_frame function. The data frame returned is in long format, with a row for each manipulated network.
Sub1Data = to_data_frame(Sub1JackknifedGlobEff)
names(Sub1Data)
GroupAData = to_data_frame(GroupAJackknifedGlobEff)
head(GroupAData)
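The long layout can be illustrated without netjack. The sketch below reshapes a small, made-up matrix of statistics (one row per original network, one column per manipulation) into one row per manipulated network. The values and the column names net_name, manipulation and global_efficiency are assumptions for illustration and may not match the columns that to_data_frame actually returns.

```r
# Made-up statistics: 2 original networks, each under 3 conditions.
stats <- matrix(c(0.50, 0.48, 0.45,
                  0.60, 0.61, 0.52),
                nrow = 2, byrow = TRUE,
                dimnames = list(net_name = c("1", "2"),
                                manipulation = c("orig", "node_1", "node_2")))

# Wide matrix -> long data frame: one row per (network, manipulation) pair.
long <- as.data.frame(as.table(stats),
                      responseName = "global_efficiency",
                      stringsAsFactors = FALSE)
names(long)
nrow(long)
```

A long layout like this is convenient for plotting with ggplot2 or for fitting models with one observation per row.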
netjack implements three statistical testing procedures in easy-to-use functions for both tabular and graphical output. The first is the difference test, which assesses whether any specific network manipulation causes a significant difference from the original network in a given network statistic. This test is implemented with the diff_test function and graphically with the diff_test_ggPlot function. Plotting uses the ggplot2 package, so the aesthetic presentation is easy to adjust.
The example dataset
GroupA has been generated so that the removal of node 10 will result in a significant difference in global efficiency from the original networks. Below is the full set of steps for this analysis:
GroupASamp = as_NetSample(GroupA, net.names = as.character(1:20))
GroupAJackknifed = net_apply(GroupASamp, net.function = "node_jackknife")
GroupAJackknifedGlobEff = net_stat_apply(GroupAJackknifed, net.stat.fun = "global_efficiency")
diff_test(GroupAJackknifedGlobEff)
diff_test_ggPlot(GroupAJackknifedGlobEff)
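The logic of a difference test can be sketched in base R on simulated numbers. The example assumes one plausible form of the test: a paired t-test per removed node, comparing each subject's statistic after removal against the same subject's original value, with a Bonferroni correction across nodes. The procedure inside diff_test may differ, and all values here are simulated for illustration only.

```r
set.seed(42)
n_subj  <- 20
n_nodes <- 5

# Simulated original statistic for each subject.
orig <- rnorm(n_subj, mean = 0.5, sd = 0.05)

# Statistic after removing each node: small noise for every node,
# plus a real drop of 0.05 when node 3 is removed.
removed <- sapply(seq_len(n_nodes), function(j) {
  orig + rnorm(n_subj, sd = 0.01) - ifelse(j == 3, 0.05, 0)
})

# Paired t-test per node (column), then correct for multiple comparisons.
p_raw <- apply(removed, 2, function(x) t.test(x, orig, paired = TRUE)$p.value)
p_adj <- p.adjust(p_raw, method = "bonferroni")

which.min(p_adj)  # index of the node whose removal changes the statistic most
```

Pairing matters here: each subject serves as their own baseline, so between-subject variability in the original statistic does not mask the effect of the manipulation.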
The second test implemented is the group test. This examines differences between two sample-level groups (such as healthy controls and individuals with a disorder) in a network statistic, subject to a network manipulation.
In this example,
GroupA has been simulated to have node 10 be important for global efficiency, while
GroupB has node 15 as important for global efficiency. We combine these datasets into a single object, and perform the group testing now.
fullGroup = c(GroupA, GroupB)
fullSamp = as_NetSample(fullGroup, net.names = as.character(1:40),
                        sample.variables = list(group = c(rep("GroupA", 20), rep("GroupB", 20))))
fullSampJackknifed = net_apply(fullSamp, net.function = "node_jackknife")
fullSampleJackknifedGlobEff = net_stat_apply(fullSampJackknifed, net.stat.fun = "global_efficiency")
group_test(fullSampleJackknifedGlobEff, grouping.variable = "group")
group_test_ggPlot(fullSampleJackknifedGlobEff, grouping.variable = "group")
Finally, the group difference test assesses whether the network manipulation has a differential impact on the network statistic between the groups. This test is implemented with group_diff_test and graphically with group_diff_test_ggPlot.
group_diff_test(fullSampleJackknifedGlobEff, grouping.variable = "group")
group_diff_test_ggPlot(fullSampleJackknifedGlobEff, grouping.variable = "group")
From this, we can see that removing node 10 or node 15 results in a significantly different change from the original global efficiency value between Group A and Group B.
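The distinction between the group test and the group difference test can be made concrete with a small base R simulation. The sketch assumes that the group difference test compares the change from the original statistic (manipulated minus original) between groups, here with a Welch two-sample t-test. This is an illustrative stand-in rather than the procedure used by group_diff_test, and all values are simulated.

```r
set.seed(7)
n_per_group <- 20
group <- rep(c("GroupA", "GroupB"), each = n_per_group)

# Simulated original statistic for all 40 subjects.
orig <- rnorm(2 * n_per_group, mean = 0.5, sd = 0.05)

# Removing the node lowers the statistic by 0.05 in GroupA only.
removed <- orig + rnorm(2 * n_per_group, sd = 0.01) -
  ifelse(group == "GroupA", 0.05, 0)

# Group test analogue: compare the manipulated statistic itself.
p_group <- t.test(removed ~ group)$p.value

# Group difference test analogue: compare the change from the original.
change <- removed - orig
p_group_diff <- t.test(change ~ group)$p.value

c(group = p_group, group_diff = p_group_diff)
```

Working with the change scores removes each subject's baseline level, so the second comparison isolates the differential impact of the manipulation rather than overall group differences in the statistic.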