calcGCD | R Documentation |
Computes the Graphlet Correlation Distance (GCD) - a graphlet-based distance measure - between two networks.
Following Yaveroglu et al. (2014), the GCD is defined as the Euclidean distance of the upper triangle values of the Graphlet Correlation Matrices (GCM) of two networks, which are defined by their adjacency matrices. The GCM of a network is a matrix with Spearman's correlations between the network's node orbits (Hocevar and Demsar, 2016).
The function considers only orbits for graphlets with up to four nodes.
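The distance definition above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `gcd_from_gcms` is a hypothetical helper, not a NetCoMi function, and the two small matrices are made-up stand-ins for real GCMs.

```r
# Hypothetical helper (not part of NetCoMi): Euclidean distance between
# the upper-triangle entries of two correlation matrices of equal size.
gcd_from_gcms <- function(gcm1, gcm2) {
  stopifnot(identical(dim(gcm1), dim(gcm2)))
  ut <- upper.tri(gcm1)  # logical mask, diagonal excluded
  sqrt(sum((gcm1[ut] - gcm2[ut])^2))
}

# Toy 3 x 3 correlation matrices standing in for two GCMs
gcm_a <- matrix(c(1.0, 0.5, 0.2,
                  0.5, 1.0, 0.1,
                  0.2, 0.1, 1.0), nrow = 3, byrow = TRUE)
gcm_b <- matrix(c(1.0, 0.1, 0.2,
                  0.1, 1.0, 0.4,
                  0.2, 0.4, 1.0), nrow = 3, byrow = TRUE)

# Differences in the upper triangle are 0.4, 0 and -0.3
toy_gcd <- gcd_from_gcms(gcm_a, gcm_b)
toy_gcd  # 0.5
```

Only the upper triangle is used because a correlation matrix is symmetric with a unit diagonal, so the remaining entries carry no additional information.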
Orbit counts are determined using the function count4 from the orca package.
Unobserved orbits would lead to NAs in the correlation matrix, which is why a row with pseudo counts of 1 is added to the orbit count matrices (ocount1 and ocount2).
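The effect of the pseudo count row can be reproduced with a toy example in base R. The count matrix and its column names below are invented for illustration; the sketch only shows why a constant column breaks Spearman's correlation and how one extra row of 1s avoids that. It is not the package's internal code.

```r
# Toy orbit count matrix: rows are nodes, columns are orbits.
# Column "O3" plays the role of an unobserved orbit (all zeros).
ocount <- cbind(O0 = c(2, 1, 3, 2),
                O1 = c(1, 0, 2, 1),
                O3 = c(0, 0, 0, 0))

# A constant column has zero variance, so its correlations with the
# other columns are NA (R also emits a warning, suppressed here).
gcm_raw <- suppressWarnings(cor(ocount, method = "spearman"))

# Append one row of pseudo counts (value 1 in every column)
ocount_pseudo <- rbind(ocount, 1)
gcm_pseudo <- cor(ocount_pseudo, method = "spearman")

is.na(gcm_raw["O0", "O3"])  # TRUE
anyNA(gcm_pseudo)           # FALSE
```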
The function is based on R code provided by Theresa Ullmann (https://orcid.org/0000-0003-1215-8561).
calcGCD(adja1, adja2, orbits = c(0, 2, 5, 7, 8, 10, 11, 6, 9, 4, 1))
adja1, adja2 | adjacency matrices (numeric) defining the two networks between which the GCD shall be calculated. |
orbits | numeric vector with integers from 0 to 14 defining the graphlet orbits to use for GCD calculation. Minimum length is 2. Defaults to c(0, 2, 5, 7, 8, 10, 11, 6, 9, 4, 1), thus excluding redundant orbits such as the orbit o3. See details. |
By default, only the 11 non-redundant orbits are used. These are grouped according to their role: orbit 0 represents the degree, orbits (2, 5, 7) represent nodes within a chain, orbits (8, 10, 11) represent nodes in a cycle, and orbits (6, 9, 4, 1) represent a terminal node.
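The grouping can be written down explicitly, which also shows where the ordering of the default `orbits` vector comes from. The sketch below uses a random toy matrix with assumed column names `O0` to `O14`; the step that maps orbit k to column k + 1 is an assumption about how such a selection could work, not a description of NetCoMi's internals.

```r
# Default orbits, grouped by role as described above
orbit_groups <- list(degree   = 0,
                     chain    = c(2, 5, 7),
                     cycle    = c(8, 10, 11),
                     terminal = c(6, 9, 4, 1))
orbits <- unlist(orbit_groups, use.names = FALSE)

# Toy stand-in for a count matrix with one column per orbit 0..14
# (the column names "O0".."O14" are an assumption for illustration)
set.seed(1)
ocount <- matrix(rpois(5 * 15, lambda = 3), nrow = 5,
                 dimnames = list(NULL, paste0("O", 0:14)))

# Orbit k sits in column k + 1 because R indexes from 1
ocount_sel <- ocount[, orbits + 1]
colnames(ocount_sel)
```

Keeping the orbits in this grouped order means that a heatmap of the resulting GCM shows the four roles as contiguous blocks.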
An object of class gcd containing the following elements:
gcd | Graphlet Correlation Distance between the two networks |
ocount1, ocount2 | Orbit counts |
gcm1, gcm2 | Graphlet Correlation Matrices |
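Hočevar T, Demšar J (2016). "Computation of Graphlet Orbits for Nodes and Edges in Sparse Graphs." Journal of Statistical Software, 74(10).
Yaveroğlu ÖN, Malod-Dognin N, Davis D, Levnajic Z, Janjic V, Karapandza R, Stojmirovic A, Pržulj N (2014). "Revealing the hidden language of complex networks." Scientific Reports, 4, 4547.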
calcGCM, testGCM
library(NetCoMi)
library(phyloseq)
# Load data sets from American Gut Project (from SpiecEasi package)
data("amgut2.filt.phy")
# Split data into two groups: with and without seasonal allergies
amgut_season_yes <- phyloseq::subset_samples(amgut2.filt.phy,
                                             SEASONAL_ALLERGIES == "yes")
amgut_season_no <- phyloseq::subset_samples(amgut2.filt.phy,
                                            SEASONAL_ALLERGIES == "no")
# Make sample sizes equal to ensure comparability
n_yes <- phyloseq::nsamples(amgut_season_yes)
ids_yes <- phyloseq::get_variable(amgut_season_no, "X.SampleID")[1:n_yes]
amgut_season_no <- phyloseq::subset_samples(amgut_season_no, X.SampleID %in% ids_yes)
# Network construction
net <- netConstruct(amgut_season_yes,
                    amgut_season_no,
                    filtTax = "highestFreq",
                    filtTaxPar = list(highestFreq = 50),
                    measure = "pearson",
                    normMethod = "clr",
                    zeroMethod = "pseudoZO",
                    sparsMethod = "thresh",
                    thresh = 0.5)
# Get adjacency matrices
adja1 <- net$adjaMat1
adja2 <- net$adjaMat2
# Network visualization
props <- netAnalyze(net)
plot(props, rmSingles = TRUE, cexLabels = 1.7)
# Calculate the GCD
gcd <- calcGCD(adja1, adja2)
gcd
# Orbit counts
head(gcd$ocount1)
head(gcd$ocount2)
# GCMs
gcd$gcm1
gcd$gcm2
# Test Graphlet Correlations for significant differences
gcmtest <- testGCM(gcd)
### Plot heatmaps
# GCM 1 (with significance code in the lower triangle)
plotHeat(gcmtest$gcm1,
         pmat = gcmtest$pAdjust1,
         type = "mixed")
# GCM 2 (with significance code in the lower triangle)
plotHeat(gcmtest$gcm2,
         pmat = gcmtest$pAdjust2,
         type = "mixed")
# Difference GCM1-GCM2 (with p-values in the lower triangle)
plotHeat(gcmtest$diff,
         pmat = gcmtest$pAdjustDiff,
         type = "mixed",
         textLow = "pmat")