knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Let's first generate a theoretical simulation:
library(SimuNet) library(magrittr) set.seed(42) n <- 5L samp.effort <- 100L Adj <- sample(1:samp.effort,n * n) |> matrix(nrow = n,dimnames = list(letters[1:n],letters[1:n])) Adj[lower.tri(Adj,diag = TRUE)] <- 0L Adj sL <- simunet(Adj,samp.effort,"upper",10) sL
Experimental designs - expDesign
objects in SimuNet
- are successions of manipulations that are
applied to a scanList
, i.e. a 3-dimensional array.
To design an experiment, simply input functions as successive arguments of design_exp()
:
my_fun1 <- function(scan.list) { # do things on scan.list scan.list } my_fun2 <- function(scan.list) { # do other things on scan.list scan.list } design_exp(my_fun1,remove_mostCentral,my_fun2)
You can store design_exp()
's output and apply it to scanList
via perform_exp()
.
An expDesign
can be expanded by inputting it in design_exp()
with other functions or
expDesign
.
eD1 <- design_exp(my_fun1,remove_mostCentral) design_exp(eD1,my_fun2) # this is similar to... # ... this: design_exp(my_fun1,remove_mostCentral,my_fun2)
Since design_sampling()
also returns expDesign
objects, this is how one can include
a sampling method within an experimental design:
design_exp(my_fun1,design_sampling("group",0.9)) foc.even <- design_sampling("focal","even") foc.even design_exp(my_fun2,foc.even,my_fun1)
design_exp()
uses purrr
's purrr::compose()
to encapsulate the sequence of manipulations into a
single function stored as FUN.seq
in the expDesign
object, to be applied to the scanList
.
Manipulation "building blocks" are included in SimuNet
to be used in experimental designs, but
users should feel free to create their own functions to use in a sequence.
WIP: SimuNet
is planned to later include a feature to automatically carry the scanList
's
attribute along the sequence manipulations, whether they include user-defined functions or not. In
the meantime, copy_attrs_to(from,to)
allows for instance to copy attrs from an original scanList
to its modified version.
Let's try out a concrete example: say you want to assess how would the adjacency matrix be affected if the overall most central node had not been there (but nothing else would have changed).
remove_mostCentral()
does just that, and can be included alone or in combination in an
expDesign
. Let's look at its code:
getS3method("remove_mostCentral","scanList") # for convenience reasons, SimuNet's building blocks # are written as S3 methods, but this isn't required
We can see that it:
scanList
into a weighted adjacency matrixscanList
, with the copied attrs
Imagine what manipulation you too can design and use in SimuNet
!
Let's include remove_mostCentral()
in its own experimental design, and apply it to sL
removeC <- design_exp(remove_mostCentral) removeC Cremoved <- sL |> perform_exp(removeC) Cremoved Cremoved |> sum_scans()
We can see that node e was removed from all scans.
Let's have a look at the original nodes' centrality:
sL |> sum_scans() |> igraph::graph.adjacency(weighted = TRUE) |> igraph::eigen_centrality(directed = FALSE) %>% .$vector
as expected, e had the highest eigen-vector centrality value.
Let see for instance how this impacted the network's clustering coefficient:
sL |> sum_scans() %>% {class(.) <- NULL;.} %>% {. + t(.)} |> # DirectedClustering requires symmetrical matrices for undirected networks DirectedClustering::ClustBCG(type = "undirected") Cremoved |> sum_scans() %>% {class(.) <- NULL;.} %>% {. + t(.)} |> DirectedClustering::ClustBCG(type = "undirected")
What would have happened if, in addition to having the most central individual removed, we also missed 20% of the edges' value when doing group-scans?
Let's design this second experiment:
removeC2 <- design_exp(remove_mostCentral,design_sampling("group",0.8)) removeC2 Cremoved2 <- sL |> perform_exp(removeC2) Cremoved2 Cremoved2 |> sum_scans()
Let's push further the previous example: we will replicate the design on a larger network containing more nodes:
set.seed(42) n <- 10L samp.effort <- 300L Adj <- sample(1:samp.effort,n * n) |> matrix(nrow = n,dimnames = list(letters[1:n],letters[1:n])) Adj[lower.tri(Adj,diag = TRUE)] <- 0L Adj sL <- simunet(Adj,samp.effort,"upper",100) sL
Replicating theoretical simulations allows to assess the clustering coefficient distribution:
get_undirectedCC <- function(scanList) { CC <- scanList |> sum_scans() %>% {class(.) <- NULL;.} %>% # modify the sumeed scanList into a class-less matrix {. + t(.)} |> # symmetrize the triangular matrix by adding its transposed matrix DirectedClustering::ClustBCG(type = "undirected") CC$GlobalCC } # calculate the clustering coefficient from a single simulation simunet(Adj,samp.effort,"upper",100) |> get_undirectedCC() # Replicating the operation CC.theo <- data.frame(type = "theoretical", CC = replicate(100, simunet(Adj,samp.effort,"upper",100) |> get_undirectedCC() ) ) summary(CC.theo)
Let's do the same when the most central node is removed, and 20% of the edges are missed:
# calculate the clustering coefficient from a single simulation simunet(Adj,samp.effort,"upper",100,removeC2) |> get_undirectedCC() # Replicating the operation CC.empi <- data.frame(type = factor(c("theoretical","empirical"), levels = c("theoretical","empirical") ), rep = rep(1:100,each = 2) |> as.factor(), CC = replicate(100,simplify = FALSE, # TODO: come up with more elegant code { sL <- simunet(Adj,samp.effort,"upper",100,removeC2) CC.theo <- sL$theoretical.scanList |> get_undirectedCC() CC.empi <- sL |> get_undirectedCC() c(CC.theo,CC.empi) } ) |> do.call(what = c) ) summary(CC.empi) library(ggplot2) CC.empi |> ggplot(aes(type,CC,colour = type,fill = type))+ geom_boxplot(alpha = .5)+ geom_line(aes(group = rep),alpha = 0.05,colour = "grey50")+ geom_point(alpha = 0.02)+ guides(fill = "none",colour = "none")+ scale_x_discrete(labels = c("Original","Most central node removed\n+\n20% edges missed"))+ labs(x = "",y = "Mean Clustering Coefficient")+ theme_minimal(15)+ theme(plot.background = element_rect(fill = 'white', colour = NA))
As seen above, combining SimuNet
's experimental design approach in combination with R's
replicate()
(or lapply()
/sapply()
to allow handling of the replication index) allows for
varied and customized collection of simulated data!
SimuNet
includes some helper functions to convert objects to fit your needs.
These helper functions include:
scanList2matList()
: convert a scanList
3D array into an R list of 2D matricesmatList2scanList()
: does the opposite as scanList2matList()
igraph::graph.adjacency()
- you can use
flexibly many network manipulations in your experimental design!Let's implement manipulations to apply on a list of matrices:
# This function mask a random node from a matrix mask_randomNode <- function(Adj) { to.mask <- sample(1:nrow(Adj),1) Adj[c(to.mask),] <- NA Adj[,c(to.mask)] <- NA Adj } # a scanList can be transformed to a list of matrices, and lapply can be used to apply # mask_randomNode to each. matList2scanList back transforms it: mask_inEachScan <- function(sL) { mL <- scanList2matList(sL) mL |> lapply(mask_randomNode) |> copy_attrs_to(from = mL) |> # keeps track of sL's attributes matList2scanList() } rand.mask <- design_exp(mask_inEachScan) rand.mask sL |> perform_exp(rand.mask)
Let's see with another example relying on igraph networks:
# This function mask a random node from a matrix rewire_each <- function(sL) { mode <- sL$mode sLapply( # wrapper for lapply over scanLists' 3rd dimension. See ?sLapply sL, function(scan) { scan |> igraph::graph.adjacency(mode = mode,weighted = TRUE) |> # transform to igraph network igraph::rewire(igraph::each_edge(p = .2, loops = FALSE)) |> # rewire networks igraph::get.adjacency(type = "upper",sparse = FALSE) # transform back to matrix } ) } rand.rewire <- design_exp(rewire_each) rand.rewire rewired <- sL |> perform_exp(rand.rewire) sL[,,3:4] rewired[,,3:4]
Coming soon!
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.