This vignette provides an introduction to the `netjack` package and gives an overview of common data input and analysis pipelines. For a tutorial on creating custom network functions and network statistics, see the "Custom Functions in `netjack`" vignette.

Samples of *registered* networks, or networks that consist of the same node set, are increasingly common in a variety of scientific fields. The `netjack` package implements a framework that lets researchers quickly manipulate and analyze large samples of registered networks, as well as develop custom functionality that builds on the existing `netjack` framework.

In this vignette, we go over the following procedures:

- Basic data objects and function classes in `netjack`
- Data Input
- Network Manipulation Functions
- Network Statistic Functions
- Difference, Group and Group Difference Testing.

`netjack` is built around a series of S4 classes that represent different levels of network manipulation, and functions that act on each level of network manipulation. This section describes these classes and functions at a summary level.

The most basic data object is the `Net` object. This represents a single network, along with node level variables, such as partition assignments.

A `NetSample` object represents a collection of `Net` objects, along with network level variables. For example, if each network in the sample represents a single individual's functional brain network, a network level variable could be the diagnostic status of each individual.

The `Net` and `NetSample` objects are representations of raw data. To work with these data objects, the `net_apply` function can be used to apply a *network manipulation function*. A function of this class takes a single network, performs a series of manipulations on it, and returns the manipulated networks. For example, the `node_jackknife()` function applied to a `Net` object returns a set of `Net` objects corresponding to the original network with each node removed in turn.
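The idea behind a node jackknife can be sketched in base R on a plain adjacency matrix. This is an illustration only, not the `netjack` implementation: the function name `node_jackknife_sketch` and the toy matrix are hypothetical, and the sketch assumes that "removing" a node means dropping its row and column.

```r
# Base R sketch of a node jackknife (not the netjack implementation).
# Assumption: removing a node means dropping its row and column.
node_jackknife_sketch <- function(adj) {
  stopifnot(is.matrix(adj), nrow(adj) == ncol(adj))
  n <- nrow(adj)
  # One reduced adjacency matrix per node.
  nets <- lapply(seq_len(n), function(i) adj[-i, -i, drop = FALSE])
  names(nets) <- paste0("node_", seq_len(n), "_removed")
  nets
}

# Toy 4-node undirected path network 1-2-3-4 (hypothetical data).
adj <- matrix(0, 4, 4)
adj[1, 2] <- adj[2, 1] <- 1
adj[2, 3] <- adj[3, 2] <- 1
adj[3, 4] <- adj[4, 3] <- 1

jackknifed <- node_jackknife_sketch(adj)
length(jackknifed)    # one network per removed node
dim(jackknifed[[1]])  # each network has one node fewer
```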

To represent the output of `net_apply`, the S4 classes `NetSet` and `NetSampleSet` are used. These classes hold both the original `Net` or `NetSample` and the product of the `net_apply` call.

One common procedure in network analysis is the calculation of network statistics. In `netjack`, *network statistic functions* are applied with the `net_stat_apply` function, which calculates a statistic for every network in a `NetSet` or `NetSampleSet` object. The output of `net_stat_apply` is a `NetStatSet` or `NetSampleStatSet` object.

`netjack` implements several statistical testing procedures that are described below. Additionally, to extract a `data.frame` of the calculated network statistics from a `NetStatSet` or `NetSampleStatSet` object, `to_data_frame()` can be used.

To illustrate the various features of `netjack`, two simulated datasets are provided, `GroupA` and `GroupB`. Networks can be loaded into the `netjack` framework from adjacency matrices, either as single `Net` objects or, more commonly, as one `NetSample` object.

```r
library(netjack)
data("GroupA")

# Build a single Net object from the first adjacency matrix in GroupA
Subject1 <- as_Net(GroupA[[1]], "Subject1")
show(Subject1)
```

Node variables can be assigned during construction as a named list:

```r
# Assign the first 10 nodes to community 1 and the last 10 to community 2
Subject2 <- as_Net(GroupA[[2]], "Subject2",
                   node.variables = list(community = c(rep(1, 10), rep(2, 10))))
show(Subject2)
```

Typically, a researcher using `netjack` is analyzing a sample of registered networks rather than a single network. `NetSample` objects can be constructed in much the same way as a `Net` object, using a list of adjacency matrices rather than a single matrix:

```r
GroupASamp = as_NetSample(GroupA,
                          net.names = as.character(1:20),
                          node.variables = list(community = c(rep(1, 10), rep(2, 10))),
                          sample.variables = list(group = rep(1, 20)))
show(GroupASamp)
```

Importantly, when a `NetSample` object is created, the list of node variables is applied to every network. This is appropriate for registered networks: in neuroimaging, for example, each node represents a specific brain region, and the node set is the same for every subject.

Sample variables represent network level characteristics. For example, if each network represents a functional connectivity network from a neuroimaging study, a sample variable might be the diagnostic status of a particular individual.

Once a sample of networks is represented as a `NetSample` object, a network manipulation function can be applied. As described previously, these functions change a network in some way. As an example, the `node_jackknife` function returns a set of networks, where each node has been removed in turn.

Network manipulation functions can be applied via `net_apply` to a `Net` object to produce a `NetSet`, or to a `NetSample` object to produce a `NetSampleSet`:

```r
# Single network: returns a NetSet
Sub1Jackknifed <- net_apply(network = Subject1, net.function = "node_jackknife")
show(Sub1Jackknifed)

# Sample of networks: returns a NetSampleSet
GroupAJackknifed <- net_apply(network = GroupASamp, net.function = "node_jackknife")
show(GroupAJackknifed)
```

Network manipulation functions that involve node level variables can be used by passing those variables through the `net.function.args` argument of `net_apply`. For example, `network_jackknife` removes sub-networks on the basis of a node level grouping variable.

```r
GroupANetJackknifed <- net_apply(GroupASamp,
                                 net.function = "network_jackknife",
                                 net.function.args = list(network.variable = "community"))
show(GroupANetJackknifed)
```
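To make the sub-network removal concrete, here is a base R sketch of the same idea on a plain adjacency matrix and a membership vector. It is not the `netjack` implementation: `subnetwork_jackknife_sketch` and the toy data are hypothetical, and the sketch assumes that removing a sub-network means dropping every node that belongs to it.

```r
# Base R sketch of a sub-network jackknife (not the netjack implementation).
# Assumption: removing a sub-network means dropping all of its nodes.
subnetwork_jackknife_sketch <- function(adj, membership) {
  stopifnot(nrow(adj) == ncol(adj), length(membership) == nrow(adj))
  groups <- unique(membership)
  nets <- lapply(groups, function(g) {
    keep <- membership != g
    adj[keep, keep, drop = FALSE]
  })
  names(nets) <- paste0("group_", groups, "_removed")
  nets
}

# Toy fully connected 6-node network with two communities (hypothetical data).
adj <- matrix(1, 6, 6)
diag(adj) <- 0
community <- c(rep(1, 4), rep(2, 2))

reduced <- subnetwork_jackknife_sketch(adj, community)
sapply(reduced, nrow)  # number of nodes left after each removal
```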

Once a network manipulation function has been applied, network statistics can be computed.

A network statistic is a single numerical summary of some aspect of a network's structure or topology. `netjack` focuses on the analysis of networks at a network statistic level, and provides simple interfaces for calculating network statistics on collections of networks.
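As an illustration of what a network statistic function computes, the following base R sketch calculates global efficiency, the average inverse shortest path length over all ordered pairs of distinct nodes, for a binary undirected adjacency matrix. It is a sketch under stated assumptions rather than the `global_efficiency` function shipped with `netjack`: the name `global_efficiency_sketch` is hypothetical, every nonzero entry is treated as an edge of length 1, and edge weights are ignored.

```r
# Base R sketch of global efficiency (not the netjack implementation).
# Assumption: any nonzero entry is an unweighted edge of length 1.
global_efficiency_sketch <- function(adj) {
  stopifnot(is.matrix(adj), nrow(adj) == ncol(adj), nrow(adj) > 1)
  n <- nrow(adj)
  # Shortest path lengths via Floyd-Warshall; absent edges start at Inf.
  d <- ifelse(adj != 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(n)) {
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  inv <- 1 / d    # 1 / Inf is 0, so disconnected pairs contribute nothing
  diag(inv) <- 0  # exclude self-pairs
  sum(inv) / (n * (n - 1))
}

# Toy networks (hypothetical data): a 4-node path and a 4-node complete graph.
path <- matrix(0, 4, 4)
path[1, 2] <- path[2, 1] <- 1
path[2, 3] <- path[3, 2] <- 1
path[3, 4] <- path[4, 3] <- 1

complete <- matrix(1, 4, 4)
diag(complete) <- 0

eff_path <- global_efficiency_sketch(path)
eff_complete <- global_efficiency_sketch(complete)
```

In the complete graph every pair is one step apart, so its efficiency is the maximum value of 1. The path graph scores lower because its end nodes are three steps apart.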

Similar in structure to the network manipulation functions, network statistic functions are applied via the `net_stat_apply` function, which can be used with either a `NetSet` object or a `NetSampleSet` object. This produces a `NetStatSet` or a `NetSampleStatSet`, respectively.

```r
# NetSet in, NetStatSet out
Sub1JackknifedGlobEff <- net_stat_apply(Sub1Jackknifed, net.stat.fun = global_efficiency)
show(Sub1JackknifedGlobEff)

# NetSampleSet in, NetSampleStatSet out
GroupAJackknifedGlobEff <- net_stat_apply(GroupAJackknifed, net.stat.fun = global_efficiency)
show(GroupAJackknifedGlobEff)
```

Once a `NetStatSet` or `NetSampleStatSet` has been computed, the network statistics can be extracted into a `data.frame` with the `to_data_frame` function. The data frame returned is in long format, with a row for each manipulated network.

```r
Sub1Data = to_data_frame(Sub1JackknifedGlobEff)
names(Sub1Data)

GroupAData = to_data_frame(GroupAJackknifedGlobEff)
head(GroupAData)
```

`netjack` implements three statistical testing procedures, each available as an easy-to-use function for tabular output with a companion function for graphical output. The first is the *difference test*, which assesses whether any specific network manipulation causes a significant difference from the original network in a given network statistic. This test is implemented in `diff_test`, with graphical output from `diff_test_ggPlot`. Plotting uses the `ggplot2` package, so the appearance of the plots is easy to customize.

The example dataset `GroupA` has been generated so that the removal of node 10 results in a significant difference in global efficiency from the original networks. Below is the full set of steps for this analysis:

```r
GroupASamp = as_NetSample(GroupA, net.names = as.character(1:20))
GroupAJackknifed = net_apply(GroupASamp, net.function = "node_jackknife")
GroupAJackknifedGlobEff = net_stat_apply(GroupAJackknifed, net.stat.fun = "global_efficiency")

# Tabular and graphical output of the difference test
diff_test(GroupAJackknifedGlobEff)
diff_test_ggPlot(GroupAJackknifedGlobEff)
```
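The logic of a difference test can be sketched in base R on simulated statistic values. This is one plausible formulation, not a description of what `diff_test` does internally: the paired t-test, the Benjamini-Hochberg adjustment, and all simulated values are assumptions made for illustration.

```r
# Base R sketch of a difference test on simulated values.
# Assumptions (not taken from netjack): paired t-tests, BH adjustment.
set.seed(42)
n_subjects <- 20
n_nodes <- 5

# Hypothetical statistic values: one original value per subject, and one
# value per subject for each removed node (rows = subjects, cols = nodes).
original <- rnorm(n_subjects, mean = 0.50, sd = 0.02)
jackknifed <- sapply(seq_len(n_nodes), function(j) {
  original + rnorm(n_subjects, mean = 0, sd = 0.01)
})
# Make removal of node 3 lower the statistic, mimicking an important node.
jackknifed[, 3] <- jackknifed[, 3] - 0.05

# Compare each jackknifed column against the original values, pairing by subject.
p_raw <- apply(jackknifed, 2, function(x) {
  t.test(x, original, paired = TRUE)$p.value
})
p_adj <- p.adjust(p_raw, method = "BH")

result <- data.frame(removed_node = seq_len(n_nodes), p_raw = p_raw, p_adj = p_adj)
result
```

Pairing by subject matters here because each jackknifed value is derived from the same subject's original network, so the between-subject variation cancels out of the comparison.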

The second test is the *group test*. It examines differences between two sample-level groups (such as healthy controls and individuals with a disorder) in a network statistic, subject to a network manipulation. It is implemented in `group_test`, with graphical output from `group_test_ggPlot`.

In this example, `GroupA` has been simulated so that node 10 is important for global efficiency, while in `GroupB` node 15 is important for global efficiency. We combine these datasets into a single object and then perform the group test:

```r
# Combine both groups and record group membership as a sample variable
fullGroup = c(GroupA, GroupB)
fullSamp = as_NetSample(fullGroup,
                        net.names = as.character(1:40),
                        sample.variables = list(group = c(rep("GroupA", 20), rep("GroupB", 20))))
fullSampJackknifed = net_apply(fullSamp, net.function = "node_jackknife")
fullSampleJackknifedGlobEff = net_stat_apply(fullSampJackknifed, net.stat.fun = "global_efficiency")

group_test(fullSampleJackknifedGlobEff, grouping.variable = "group")
group_test_ggPlot(fullSampleJackknifedGlobEff, grouping.variable = "group")
```

Finally, the *group difference test* assesses whether the network manipulation has a differential impact on the network statistic between the groups. This test is implemented in `group_diff_test`, with graphical output from `group_diff_test_ggPlot`.

```r
group_diff_test(fullSampleJackknifedGlobEff, grouping.variable = "group")
group_diff_test_ggPlot(fullSampleJackknifedGlobEff, grouping.variable = "group")
```

From this, we can see that removing node 10 or node 15 changes global efficiency, relative to its original value, by significantly different amounts in Group A and Group B.
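The quantity compared in a group difference test can be sketched in base R: for each subject, take the change in the statistic after the manipulation, then compare those changes between groups. The two-sample t-test and all simulated values below are assumptions made for illustration, not a description of the internals of `group_diff_test`.

```r
# Base R sketch of a group difference test for one removed node.
# Assumption (not taken from netjack): Welch two-sample t-test on the changes.
set.seed(7)
n_per_group <- 20
group <- rep(c("GroupA", "GroupB"), each = n_per_group)

# Hypothetical statistic values on the original networks.
original <- rnorm(2 * n_per_group, mean = 0.50, sd = 0.02)

# After removing the node, GroupA drops by 0.05 on average; GroupB barely changes.
effect <- ifelse(group == "GroupA", -0.05, 0)
jackknifed <- original + effect + rnorm(2 * n_per_group, sd = 0.01)

# Per-subject change caused by the manipulation, compared between groups.
change <- jackknifed - original
test <- t.test(change ~ group)
test$p.value
```

Testing the change rather than the jackknifed value itself separates the effect of the manipulation from any baseline difference between the groups.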
