netCompare | R Documentation |
Calculate and compare network properties for microbial networks using Jaccard's index, the Rand index, the Graphlet Correlation Distance, and permutation tests.
netCompare(
x,
permTest = FALSE,
jaccQuant = 0.75,
lnormFit = NULL,
testRand = TRUE,
nPermRand = 1000L,
gcd = TRUE,
gcdOrb = c(0, 2, 5, 7, 8, 10, 11, 6, 9, 4, 1),
verbose = TRUE,
nPerm = 1000L,
adjust = "adaptBH",
trueNullMethod = "convest",
cores = 1L,
logFile = NULL,
seed = NULL,
fileLoadAssoPerm = NULL,
fileLoadCountsPerm = NULL,
storeAssoPerm = FALSE,
fileStoreAssoPerm = "assoPerm",
storeCountsPerm = FALSE,
fileStoreCountsPerm = c("countsPerm1", "countsPerm2"),
returnPermProps = FALSE,
returnPermCentr = FALSE,
assoPerm = NULL,
dissPerm = NULL
)
x |
object of class |
permTest |
logical. If |
jaccQuant |
numeric value between 0 and 1 specifying the quantile used as threshold to identify the most central nodes for each centrality measure. The resulting sets of nodes are used to calculate Jaccard's index (see details). Default is 0.75. |
lnormFit |
logical indicating whether a log-normal distribution should
be fitted to the calculated centrality values for determining Jaccard's
index (see details). If |
testRand |
logical. If |
nPermRand |
integer giving the number of permutations used for testing
the adjusted Rand index for being significantly different from zero.
Ignored if |
gcd |
logical. If |
gcdOrb |
numeric vector with integers from 0 to 14 defining the orbits used for calculating the GCD. Minimum length is 2. Defaults to c(0, 2, 5, 7, 8, 10, 11, 6, 9, 4, 1), thus excluding redundant orbits such as the orbit o3. |
verbose |
logical. If |
nPerm |
integer giving the number of permutations if
|
adjust |
character indicating the method used for multiple testing
adjustment of the permutation p-values. Possible values are |
trueNullMethod |
character indicating the method used for estimating the
proportion of true null hypotheses from a vector of p-values. Used for the
adaptive Benjamini-Hochberg method for multiple testing adjustment (chosen
by |
cores |
integer indicating the number of CPU cores used for
permutation tests. If cores > 1, the tests are performed in parallel.
Is limited to the number of available CPU cores determined by
|
logFile |
character string naming the log file to which the current
iteration number is written (if permutation tests are performed). Defaults
to |
seed |
integer giving a seed for reproducibility of the results. |
fileLoadAssoPerm |
character giving the name or path (without file
extension) of the file containing the "permuted" association/dissimilarity
matrices that was generated by setting |
fileLoadCountsPerm |
character giving the name or path (without file
extension) of the file containing the "permuted" count matrices that was
generated by setting |
storeAssoPerm |
logical indicating whether the association/dissimilarity
matrices for the permuted data should be saved to a file.
The file name is given via |
fileStoreAssoPerm |
character giving the name of a file to which the matrix with associations/dissimilarities of the permuted data is saved. Can also be a path. |
storeCountsPerm |
logical indicating whether the permuted count matrices
should be saved to an external file. Defaults to |
fileStoreCountsPerm |
character vector with two elements giving the names of two files storing the permuted count matrices belonging to the two groups. |
returnPermProps |
logical. If |
returnPermCentr |
logical. If |
assoPerm |
only needed for output generated with NetCoMi v1.0.1! A list
with two elements used for the permutation procedure.
Each entry must contain association matrices for |
dissPerm |
only needed for output generated with NetCoMi v1.0.1!
Usage analog to |
Permutation procedure:
Used for testing centrality measures and global network properties for
group differences.
The null hypothesis of the tests is defined as
H_0: c1_i - c2_i = 0,
where c1_i and c2_i denote the centrality values of taxon i in group 1 and 2,
respectively.
To generate a sampling distribution of the differences under H_0,
the group labels are randomly reassigned to the samples while the group
sizes are kept. The associations are then re-estimated for each permuted
data set. The p-values are calculated as the proportion of
"permutation-differences" being larger than or equal to the observed
difference. In non-exact tests, a pseudo-count is added to the numerator
and denominator to avoid p-values of zero. Several methods for adjusting
the p-values for multiplicity are available.
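The permutation p-value described above can be sketched in base R as follows. This is a minimal illustration, not the NetCoMi implementation: the data are simulated, the statistic is a plain group mean standing in for a centrality value (so no associations are re-estimated), and the names abs_diff, n_perm, and p_val are hypothetical.

```r
# Toy sketch of a permutation p-value for a group difference.
# Assumption: a group mean stands in for the centrality of one taxon.
set.seed(1)  # illustrative seed

g1 <- rnorm(20, mean = 1)          # simulated values, group 1
g2 <- rnorm(20, mean = 0)          # simulated values, group 2
values <- c(g1, g2)
labels <- rep(c(1, 2), each = 20)

# Absolute difference of the statistic between the two groups
abs_diff <- function(v, lab) abs(mean(v[lab == 1]) - mean(v[lab == 2]))

obs <- abs_diff(values, labels)

# Reassign the group labels at random; sample() keeps the group sizes
n_perm <- 1000
perm <- replicate(n_perm, abs_diff(values, sample(labels)))

# Proportion of permutation differences >= observed difference, with a
# pseudo-count in numerator and denominator so the p-value is never zero
p_val <- (sum(perm >= obs) + 1) / (n_perm + 1)
p_val
```

In netCompare itself the associations are re-estimated for every permuted data set, which is why the real procedure is far more expensive than this sketch.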
Jaccard's index:
Jaccard's index expresses for each centrality measure how similar the sets of
most central nodes are between the two networks.
These sets are defined as nodes with a centrality value above a defined
quantile (via jaccQuant) either of the empirical distribution of the
centrality values (lnormFit = FALSE) or of a fitted log-normal
distribution (lnormFit = TRUE).
The index ranges from 0 to 1, where 1 means the sets of most central nodes
are exactly equal in both networks and 0 indicates that the
most central nodes are completely different.
The index is calculated as suggested by Real and Vargas (1996).
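As a rough illustration of the set construction described above, the following base R sketch computes Jaccard's index from empirical quantiles (the lnormFit = FALSE case). The centrality values are made up, and top_nodes, jacc_quant, and jacc are hypothetical names; how NetCoMi handles ties or empty sets is not covered here.

```r
# Toy centrality values for the same six nodes in two networks
cent1 <- c(a = 0.9, b = 0.8, c = 0.1, d = 0.2, e = 0.7, f = 0.3)
cent2 <- c(a = 0.8, b = 0.1, c = 0.2, d = 0.9, e = 0.6, f = 0.3)

jacc_quant <- 0.75  # plays the role of the jaccQuant argument

# Nodes whose centrality lies above the empirical quantile
top_nodes <- function(cent, q) names(cent)[cent > quantile(cent, q)]

set1 <- top_nodes(cent1, jacc_quant)  # "a" "b"
set2 <- top_nodes(cent2, jacc_quant)  # "a" "d"

# Jaccard's index: size of the intersection over size of the union
jacc <- length(intersect(set1, set2)) / length(union(set1, set2))
jacc  # 1/3 for these toy values
```

One shared node out of three distinct nodes gives an index of 1/3; identical sets would give 1 and disjoint sets 0.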
Rand index:
The Rand index is used to express whether the determined clusterings are
equal in both groups. The adjusted Rand index (ARI) ranges from -1 to 1,
where 1 indicates that the two clusterings are exactly equal. The expected
index value for two random clusterings is 0. The implemented test procedure
is in accordance with the explanations in Qannari et al. (2014),
where a p-value below the significance level alpha means that the ARI is
significantly higher than expected for two random clusterings.
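The ARI and a simple permutation test for it can be sketched in base R as shown below. This is an illustrative sketch only: adj_rand is a hypothetical helper implementing the standard ARI formula, the cluster labels are made up, and the permutation scheme (shuffling one label vector, with a pseudo-count in the p-value) is an assumption that may differ in detail from the procedure in Qannari et al. (2014).

```r
# Adjusted Rand index of two clusterings of the same nodes
adj_rand <- function(x, y) {
  tab <- table(x, y)                       # contingency table
  choose2 <- function(n) n * (n - 1) / 2   # number of pairs
  sum_ij <- sum(choose2(tab))
  sum_a <- sum(choose2(rowSums(tab)))
  sum_b <- sum(choose2(colSums(tab)))
  expected <- sum_a * sum_b / choose2(length(x))
  max_index <- (sum_a + sum_b) / 2
  (sum_ij - expected) / (max_index - expected)
}

# Toy cluster labels for ten nodes in two networks
clust1 <- c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
clust2 <- c(1, 1, 2, 2, 2, 2, 3, 3, 3, 1)

ari_obs <- adj_rand(clust1, clust2)

# Null distribution: shuffle one clustering so that cluster sizes are
# kept but the node-to-cluster assignment is random
set.seed(1)  # illustrative seed
n_perm <- 1000
ari_perm <- replicate(n_perm, adj_rand(clust1, sample(clust2)))

# One-sided p-value: is the observed ARI higher than expected by chance?
p_val <- (sum(ari_perm >= ari_obs) + 1) / (n_perm + 1)
c(ARI = ari_obs, p = p_val)
```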
Graphlet Correlation Distance:
A graphlet-based distance measure, which is defined as the Euclidean
distance of the upper triangle values of the Graphlet Correlation
Matrices (GCM) of two networks (Yaveroglu et al., 2014).
The GCM of a network is a matrix with Spearman's correlations between the
network's node orbits (Hocevar and Demsar, 2016).
See calcGCD for details.
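The definition above can be illustrated with a short base R sketch. The orbit counts here are random placeholders (in practice they come from counting graphlet orbits per node), and orb1, orb2, and gcd_val are hypothetical names; this does not reproduce calcGCD.

```r
set.seed(1)  # illustrative seed
n_nodes <- 30
n_orb <- 11  # number of orbits, as in the default gcdOrb

# Placeholder orbit-count matrices (nodes x orbits) for two networks;
# real counts would be derived from the adjacency matrices
orb1 <- matrix(rpois(n_nodes * n_orb, lambda = 5), nrow = n_nodes)
orb2 <- matrix(rpois(n_nodes * n_orb, lambda = 5), nrow = n_nodes)

# Graphlet Correlation Matrix: Spearman correlations between orbits
gcm1 <- cor(orb1, method = "spearman")
gcm2 <- cor(orb2, method = "spearman")

# GCD: Euclidean distance between the upper-triangle entries
ut <- upper.tri(gcm1)
gcd_val <- sqrt(sum((gcm1[ut] - gcm2[ut])^2))
gcd_val
```

A distance of zero would mean that the two networks have identical graphlet correlation structure for the selected orbits.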
Object of class microNetComp with the following elements:
jaccDeg,jaccBetw,jaccClose,jaccEigen | Values of Jaccard's index for the centrality measures |
jaccHub | Jaccard index for the sets of hub nodes |
randInd | Adjusted Rand index |
randIndLCC | Adjusted Rand index for the largest connected component (LCC) |
gcd | Graphlet Correlation Distance (object of class gcd returned by calcGCD) |
gcdLCC | Graphlet Correlation Distance for the LCC |
properties | List with calculated network properties |
propertiesLCC | List with calculated network properties of the LCC |
diffGlobal | Vectors with differences of global properties |
diffGlobalLCC | Vectors with differences of global properties for the LCC |
diffCent | Vectors with differences of the centrality values |
countMatrices | The two count matrices returned by netConstruct |
assoMatrices | The two association matrices returned by netConstruct |
dissMatrices | The two dissimilarity matrices returned by netConstruct |
adjaMatrices | The two adjacency matrices returned by netConstruct |
groups | Group names returned by netConstruct |
paramsProperties | Parameters used for network analysis |
Additional output if permutation tests are conducted:
pvalDiffGlobal | P-values of the tests for differential global properties |
pvalDiffGlobalLCC | P-values of the tests for differential global properties in the LCC |
pvalDiffCentr | P-values of the tests for differential centrality values |
pvalDiffCentrAdjust | Adjusted p-values of the tests for differential centrality values |
permDiffGlobal | nPerm x 10 matrix containing the absolute differences of the ten global network properties (computed for the whole network) for all nPerm permutations |
permDiffGlobalLCC | nPerm x 11 matrix containing the absolute differences of the eleven global network properties (computed for the LCC) for all nPerm permutations |
permDiffCentr | List with absolute differences of the four centrality measures for all nPerm permutations. Each list element is an nPerm x nNodes matrix. |
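Benjamini Y, Hochberg Y (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. Journal of Educational and Behavioral Statistics.
Farcomeni A (2007). Some results on the control of the false discovery rate under dependence. Scandinavian Journal of Statistics.
Gill R, Datta S, Datta S (2010). A statistical framework for differential network analysis from microarray data. BMC Bioinformatics.
Hocevar T, Demsar J (2016). Computation of graphlet orbits for nodes and edges in sparse graphs. Journal of Statistical Software.
Qannari EM, Courcoux P, Faye P (2014). Significance test of the adjusted Rand index. Application to the free sorting task. Food Quality and Preference.
Real R, Vargas JM (1996). The probabilistic basis of Jaccard's index of similarity. Systematic Biology.
Yaveroglu ON et al. (2014). Revealing the hidden language of complex networks. Scientific Reports.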
summary.microNetComp, netConstruct, netAnalyze
# Load data sets from American Gut Project (from SpiecEasi package)
data("amgut2.filt.phy")
# Split data into two groups: with and without seasonal allergies
amgut_season_yes <- phyloseq::subset_samples(amgut2.filt.phy,
                                             SEASONAL_ALLERGIES == "yes")
amgut_season_no <- phyloseq::subset_samples(amgut2.filt.phy,
                                            SEASONAL_ALLERGIES == "no")
amgut_season_yes
amgut_season_no
# Filter the 121 samples (sample size of the smaller group) with highest
# frequency to make the sample sizes equal and thus ensure comparability.
n_yes <- phyloseq::nsamples(amgut_season_yes)
# Network construction
amgut_net <- netConstruct(data = amgut_season_yes,
                          data2 = amgut_season_no,
                          measure = "pearson",
                          filtSamp = "highestFreq",
                          filtSampPar = list(highestFreq = n_yes),
                          filtTax = "highestVar",
                          filtTaxPar = list(highestVar = 30),
                          zeroMethod = "pseudoZO", normMethod = "clr")
# Network analysis
# Note: Please zoom into the GCM plot or open a new window using:
# x11(width = 10, height = 10)
amgut_props <- netAnalyze(amgut_net, clustMethod = "cluster_fast_greedy")
# Network plot
plot(amgut_props,
     sameLayout = TRUE,
     title1 = "Seasonal allergies",
     title2 = "No seasonal allergies")
#--------------------------
# Network comparison
# Without permutation tests
amgut_comp1 <- netCompare(amgut_props, permTest = FALSE)
summary(amgut_comp1)
# With permutation tests (with only 100 permutations to decrease runtime)
amgut_comp2 <- netCompare(amgut_props,
                          permTest = TRUE,
                          nPerm = 100L,
                          cores = 1L,
                          storeCountsPerm = TRUE,
                          fileStoreCountsPerm = c("countsPerm1",
                                                  "countsPerm2"),
                          storeAssoPerm = TRUE,
                          fileStoreAssoPerm = "assoPerm",
                          seed = 123456)
# Rerun with a different adjustment method ...
# ... using the stored permutation count matrices
amgut_comp3 <- netCompare(amgut_props, adjust = "BH",
                          permTest = TRUE, nPerm = 100L,
                          fileLoadCountsPerm = c("countsPerm1",
                                                 "countsPerm2"),
                          seed = 123456)
# ... using the stored permutation association matrices
amgut_comp4 <- netCompare(amgut_props, adjust = "BH",
                          permTest = TRUE, nPerm = 100L,
                          fileLoadAssoPerm = "assoPerm",
                          seed = 123456)
# amgut_comp3 and amgut_comp4 should be equal
all.equal(amgut_comp3$adjaMatrices, amgut_comp4$adjaMatrices)
all.equal(amgut_comp3$properties, amgut_comp4$properties)
summary(amgut_comp2)
summary(amgut_comp3)
summary(amgut_comp4)
#--------------------------
# Use 'createAssoPerm' to create "permuted" count and association matrices
createAssoPerm(amgut_props, nPerm = 100,
               computeAsso = TRUE,
               fileStoreAssoPerm = "assoPerm",
               storeCountsPerm = TRUE,
               fileStoreCountsPerm = c("countsPerm1", "countsPerm2"),
               append = FALSE, seed = 123456)
amgut_comp5 <- netCompare(amgut_props, permTest = TRUE, nPerm = 100L,
                          fileLoadAssoPerm = "assoPerm")
all.equal(amgut_comp3$properties, amgut_comp5$properties)
summary(amgut_comp5)