# Suppress title check warning options(rmarkdown.html_vignette.check_title = FALSE) knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(NetCoMi)
One of NetCoMi's strengths is the ability to compare networks between two groups.
The netCompare()
function is used for this task.
The amgut data set is split by "SEASONAL_ALLERGIES"
leading to two subsets of
samples (with and without seasonal allergies). We ignore the "None" group.
data("amgut2.filt.phy") # Split the phyloseq object into two groups amgut_season_yes <- phyloseq::subset_samples(amgut2.filt.phy, SEASONAL_ALLERGIES == "yes") amgut_season_no <- phyloseq::subset_samples(amgut2.filt.phy, SEASONAL_ALLERGIES == "no") amgut_season_yes amgut_season_no
The 50 nodes with highest variance are selected for network construction to get smaller networks.
We filter the 121 samples (sample size of the smaller group) with highest frequency to make the sample sizes equal and thus ensure comparability.
n_yes <- phyloseq::nsamples(amgut_season_yes) # Network construction net_season <- netConstruct(data = amgut_season_no, data2 = amgut_season_yes, filtTax = "highestVar", filtTaxPar = list(highestVar = 50), filtSamp = "highestFreq", filtSampPar = list(highestFreq = n_yes), measure = "spring", measurePar = list(nlambda = 10, rep.num = 10, Rmethod = "approx"), normMethod = "none", zeroMethod = "none", sparsMethod = "none", dissFunc = "signed", verbose = 2, seed = 123456)
Alternatively, a group vector could be passed to group
, according to which
the data set is split into two groups:
# Get count table countMat <- phyloseq::otu_table(amgut2.filt.phy) # netConstruct() expects samples in rows countMat <- t(as(countMat, "matrix")) group_vec <- phyloseq::get_variable(amgut2.filt.phy, "SEASONAL_ALLERGIES") # Select the two groups of interest (level "none" is excluded) sel <- which(group_vec %in% c("no", "yes")) group_vec <- group_vec[sel] countMat <- countMat[sel, ] net_season <- netConstruct(countMat, group = group_vec, filtTax = "highestVar", filtTaxPar = list(highestVar = 50), filtSamp = "highestFreq", filtSampPar = list(highestFreq = n_yes), measure = "spring", measurePar = list(nlambda=10, rep.num=10, Rmethod = "approx"), normMethod = "none", zeroMethod = "none", sparsMethod = "none", dissFunc = "signed", verbose = 3, seed = 123456)
The object returned by netConstruct()
containing both networks is again
passed to netAnalyze()
. Network properties are computed for both networks
simultaneously.
To demonstrate further functionalities of netAnalyze()
, we play around with
the available arguments, even if the chosen setting might not be optimal.
centrLCC = FALSE
: Centralities are calculated for all nodes (not only for the
largest connected component).avDissIgnoreInf = TRUE
: Nodes with an infinite dissimilarity are ignored
when calculating the average dissimilarity.sPathNorm = FALSE
: Shortest paths are not normalized by average dissimilarity.hubPar = c("degree", "eigenvector")
: Hubs are nodes with highest
degree and eigenvector centrality at the same time.lnormFit = TRUE
and hubQuant = 0.9
: A log-normal distribution is fitted to
the centrality values to identify nodes with "highest" centrality values.
Here, a node is identified as hub if for each of the three centrality measures,
the node's centrality value is above the 90% quantile of the fitted log-normal
distribution.Note! The arguments must be set carefully, depending on the research questions. NetCoMi's default values are not generally preferable in all practical cases!
props_season <- netAnalyze(net_season, centrLCC = FALSE, avDissIgnoreInf = TRUE, sPathNorm = FALSE, clustMethod = "cluster_fast_greedy", hubPar = c("degree", "eigenvector"), hubQuant = 0.9, lnormFit = TRUE, normDeg = FALSE, normBetw = FALSE, normClose = FALSE, normEigen = FALSE) summary(props_season)
First, the layout is computed separately in both groups (qgraph's "spring" layout in this case).
Node sizes are scaled according to the mclr-transformed data since SPRING
uses
the mclr transformation as normalization method.
Node colors represent clusters. Note that by default, two clusters have the same
color in both groups if they have at least two nodes in common
(sameColThresh = 2
). Set sameClustCol
to FALSE
to get different cluster colors.
plot(props_season, sameLayout = FALSE, nodeColor = "cluster", nodeSize = "mclr", labelScale = FALSE, cexNodes = 1.5, cexLabels = 2.5, cexHubLabels = 3, cexTitle = 3.7, groupNames = c("No seasonal allergies", "Seasonal allergies"), hubBorderCol = "gray40") legend("bottom", title = "estimated association:", legend = c("+","-"), col = c("#009900","red"), inset = 0.02, cex = 4, lty = 1, lwd = 4, bty = "n", horiz = TRUE)
Using different layouts leads to a "nice-looking" network plot for each group, however, it is difficult to identify group differences at first glance.
Thus, we now use the same layout in both groups. In the following, the layout is computed for group 1 (the left network) and taken over for group 2.
rmSingles
is set to "inboth"
because only nodes that are unconnected in both
groups can be removed if the same layout is used.
plot(props_season, sameLayout = TRUE, layoutGroup = 1, rmSingles = "inboth", nodeSize = "mclr", labelScale = FALSE, cexNodes = 1.5, cexLabels = 2.5, cexHubLabels = 3, cexTitle = 3.8, groupNames = c("No seasonal allergies", "Seasonal allergies"), hubBorderCol = "gray40") legend("bottom", title = "estimated association:", legend = c("+","-"), col = c("#009900","red"), inset = 0.02, cex = 4, lty = 1, lwd = 4, bty = "n", horiz = TRUE)
In the above plot, we can see clear differences between the groups. The OTU "322235", for instance, is more strongly connected in the "Seasonal allergies" group than in the group without seasonal allergies, which is why it is a hub on the right, but not on the left.
However, if the layout of one group is simply taken over to the other, one of
the networks (here the "seasonal allergies" group) is usually not that
nice-looking due to the long edges.
Therefore, NetCoMi (>= 1.0.2) offers a further
option (layoutGroup = "union"
), where a union of the two layouts is used
in both groups.
In doing so, the nodes are placed as optimal as possible equally for both networks.
The idea and R code for this functionality were provided by Christian L. Müller and Alice Sommer
plot(props_season, sameLayout = TRUE, repulsion = 0.95, layoutGroup = "union", rmSingles = "inboth", nodeSize = "mclr", labelScale = FALSE, cexNodes = 1.5, cexLabels = 2.5, cexHubLabels = 3, cexTitle = 3.8, groupNames = c("No seasonal allergies", "Seasonal allergies"), hubBorderCol = "gray40") legend("bottom", title = "estimated association:", legend = c("+","-"), col = c("#009900","red"), inset = 0.02, cex = 4, lty = 1, lwd = 4, bty = "n", horiz = TRUE)
Since runtime is considerably increased if permutation tests are
performed, we set the permTest
parameter to FALSE
. See the
tutorial_createAssoPerm
file for a network comparison including permutation
tests.
Since permutation tests are still conducted for the Adjusted Rand Index, a seed should be set for reproducibility.
comp_season <- netCompare(props_season, permTest = FALSE, verbose = FALSE, seed = 123456) summary(comp_season, groupNames = c("No allergies", "Allergies"), showCentr = c("degree", "between", "closeness"), numbNodes = 5)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.