perm.test.nw: Permutation-based test for differences between two networks

perm.test.nw    R Documentation

Permutation-based test for differences between two networks

Description

This function provides a permutation-based framework for testing for differences between two networks. In particular, various (i) network estimation methods and (ii) network difference characteristics can be specified.

Usage

perm.test.nw(
  A,
  B,
  permnum,
  methodlist,
  thresh = 0.05,
  score.funct,
  paired = FALSE
)

Arguments

A, B

input data tables from which the adjacency matrices will be generated, provided as matrices, arrays, data frames or tibbles; both must have the same number of columns (corresponding to the number of nodes)

permnum

a number, specifying the number of permutations

methodlist

a list specifying the method which is used to estimate and create the adjacency matrices; see details for possible options and further information

thresh

a number between 0 and 1 (default is set to 0.05) specifying the significance level: if the p-value corresponding to an edge weight is greater than thresh, the edge weight is not considered significant and is set to zero

score.funct

the function used to compare the adjacency matrices estimated from A and B; see details for possible options and further information

paired

Boolean (default is set to FALSE), specifying whether the data underlying the two networks are paired or not

Details

This function provides a permutation-based framework for testing for differences between two networks. In particular, various (i) network estimation methods and (ii) network difference characteristics can be specified.
(i) The network estimation method has to be specified as a list in the methodlist argument. Currently, the following estimation methods are supported:

  • list("Spearman")
    Edge weights are estimated using Spearman correlation, where unadjusted p-values are employed to determine significance. To apply this method, the expression list("Spearman") has to be provided in the methodlist argument.

  • list("Spearman.adj",adjustment method)
    Edge weights are estimated using Spearman correlation, where p-values adjusted for multiple testing are employed to determine significance. To apply this method, the expression list("Spearman.adj",adjustment method) has to be provided in the methodlist argument, where adjustment method has to be one of the options for multiple testing adjustment provided by the standard p.adjust R function, i.e. one of "BH", "bonferroni", "BY", "fdr", "hochberg", "holm" or "hommel".

  • list("PCSpearman")
    Edge weights are estimated using partial Spearman correlation, where unadjusted p-values are employed to determine significance. To apply this method, the expression list("PCSpearman") has to be provided in the methodlist argument.

  • list("PCSpearman.adj",adjustment method)
    Edge weights are estimated using partial Spearman correlation, where p-values adjusted for multiple testing are employed to determine significance. To apply this method, the expression list("PCSpearman.adj",adjustment method) has to be provided in the methodlist argument, where adjustment method has to be one of the options for multiple testing adjustment provided by the standard p.adjust R function, i.e. one of "BH", "bonferroni", "BY", "fdr", "hochberg", "holm" or "hommel".

  • list("DistCorr")
    Edge weights are estimated using distance correlation, where unadjusted p-values are employed to determine significance. To apply this method, the expression list("DistCorr") has to be provided in the methodlist argument. Note that this method may take considerably longer to compute, as a permutation test is used to derive the p-values for the distance correlations.

  • list("DistCorr.adj",adjustment method)
    Edge weights are estimated using distance correlation, where p-values adjusted for multiple testing are employed to determine significance. To apply this method, the expression list("DistCorr.adj",adjustment method) has to be provided in the methodlist argument, where adjustment method has to be one of the options for multiple testing adjustment provided by the standard p.adjust R function, i.e. one of "BH", "bonferroni", "BY", "fdr", "hochberg", "holm" or "hommel". Note that this method may take considerably longer to compute, as a permutation test is used to derive the p-values for the distance correlations.

  • list("EBICglasso",correlation type,tuning parameter)
    Edge weights are estimated using the EBICglasso approach. To apply this method, the expression list("EBICglasso",correlation type,tuning parameter) has to be provided in the methodlist argument. Here, correlation type has to be one of the association options provided by the standard cor R function, i.e. one of "kendall", "pearson" or "spearman". Moreover, tuning parameter has to be a number specifying the EBIC tuning parameter γ. Typical choices are values between 0 and 0.5, where smaller values usually lead to higher sensitivity, in that more edges are included in the network.
    Note that for EBICglasso, specifying the thresh argument is unnecessary, as it is not used by this method.
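
The thresholding idea behind the "Spearman" and "Spearman.adj" options can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the helper name spearman_adjacency_sketch and its adjust argument are hypothetical, and the package may compute or adjust the p-values differently.

```r
# Hypothetical helper (not part of the package): Spearman-based adjacency
# matrix in which edges with p-value > thresh are set to zero.
spearman_adjacency_sketch <- function(X, thresh = 0.05, adjust = "none") {
  X <- as.matrix(X)
  N <- ncol(X)
  W <- matrix(0, N, N)          # edge weights (Spearman correlations)
  P <- matrix(NA_real_, N, N)   # corresponding p-values
  for (i in seq_len(N - 1)) {
    for (j in (i + 1):N) {
      ct <- suppressWarnings(cor.test(X[, i], X[, j], method = "spearman"))
      W[i, j] <- W[j, i] <- unname(ct$estimate)
      P[i, j] <- P[j, i] <- ct$p.value
    }
  }
  # Adjust the upper-triangle p-values only, so that each edge counts once,
  # then mirror the adjusted values into the lower triangle.
  up <- upper.tri(P)
  P[up] <- p.adjust(P[up], method = adjust)
  P[lower.tri(P)] <- t(P)[lower.tri(P)]
  # Non-significant edges are removed from the network.
  W[!is.na(P) & P > thresh] <- 0
  W
}

set.seed(1)
x <- rnorm(50)
X <- cbind(x, x + rnorm(50, sd = 0.1), rnorm(50))
adj_unadjusted <- spearman_adjacency_sketch(X)
adj_bonferroni <- spearman_adjacency_sketch(X, adjust = "bonferroni")
```

Because adjusted p-values are never smaller than the unadjusted ones, the adjusted network can only contain a subset of the edges of the unadjusted network.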

(ii) To quantify differences between two networks, the following (a) overall, (b) node-specific and (c) edge-specific network difference characteristics, which have to be supplied in the score.funct argument, are currently supported:

  • frobenius.metric (overall)
    Calculates the Frobenius metric between two networks

  • global.str (overall)
    Calculates the difference in global strength between two networks

  • maximum.metric (overall)
    Calculates the maximum metric between two networks

  • number.differences (overall)
    Calculates the differences in numbers of edges, clusters and isolated nodes between two networks

  • spec.dist (overall)
    Calculates the spectral distance between two networks

  • jaccard.dist (overall)
    Calculates the Jaccard distance between two networks

  • betweenness.inv (node-specific)
    Calculates the differences in betweenness between two networks for each node

  • closeness.inv (node-specific)
    Calculates the differences in closeness between two networks for each node

  • degree.inv (node-specific)
    Calculates the differences in degree between two networks for each node

  • eigen.inv (node-specific)
    Calculates the differences in eigenvector centrality between two networks for each node

  • edge.inv (edge-specific)
    Calculates the differences in absolute edge weights between two networks for each edge

  • edge.inv.direc (edge-specific)
    Calculates the differences in edge weights between two networks for each edge
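
To give a feeling for what the overall characteristics measure, the following base R sketch shows common textbook definitions of three of them for two small adjacency matrices. The function names are hypothetical, and the exact definitions used by frobenius.metric, maximum.metric and global.str (for example, any normalisation) are assumptions here and may differ in the package.

```r
# Hypothetical stand-ins (not the package functions) for three overall scores.
frobenius_sketch <- function(A, B) sqrt(sum((A - B)^2))   # Frobenius norm of A - B
maximum_sketch   <- function(A, B) max(abs(A - B))        # largest absolute entry of A - B
global_strength_sketch <- function(A, B) {
  up <- upper.tri(A)                                      # count each edge once
  abs(sum(abs(A[up])) - sum(abs(B[up])))                  # difference in summed absolute weights
}

# Two symmetric 3 x 3 adjacency matrices (filled column by column)
A <- matrix(c(0, 0.5, 0,   0.5, 0, 0.2,   0, 0.2, 0), 3, 3)
B <- matrix(c(0, 0.1, 0,   0.1, 0, 0,     0, 0,   0), 3, 3)

frob <- frobenius_sketch(A, B)        # sqrt(0.4)
maxm <- maximum_sketch(A, B)          # 0.4
gstr <- global_strength_sketch(A, B)  # 0.6
```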

Typically, a large number of permutations (e.g. 1000 or 10000) should be chosen in order to obtain reliable results. Note that computation time increases with the number of permutations.
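
The role of the number of permutations can be illustrated with a minimal base R sketch of an unpaired permutation test. Everything in it is an assumption made for illustration: the function perm_test_sketch, the estimate and score arguments, and the one-sided rule "permuted statistic >= observed statistic" are hypothetical and need not match what perm.test.nw does internally.

```r
# Hypothetical sketch (not the package implementation) of an unpaired
# permutation test for a single overall network difference score.
perm_test_sketch <- function(A, B, permnum, estimate, score) {
  A <- as.matrix(A)
  B <- as.matrix(B)
  nA <- nrow(A)
  pooled <- rbind(A, B)
  observed <- score(estimate(A), estimate(B))
  permuted <- numeric(permnum)
  for (k in seq_len(permnum)) {
    idx <- sample(nrow(pooled))                        # shuffle group membership
    A_perm <- pooled[idx[seq_len(nA)], , drop = FALSE]
    B_perm <- pooled[idx[-seq_len(nA)], , drop = FALSE]
    permuted[k] <- score(estimate(A_perm), estimate(B_perm))
  }
  exceed <- sum(permuted >= observed)
  list(statistic = observed,
       permuted = permuted,
       pvalue = exceed / permnum,
       pvalue_pseudocount = (exceed + 1) / (permnum + 1))  # never exactly zero
}

# Simple stand-ins for the estimation method and the score function
estimate_sketch <- function(X) {
  R <- cor(X, method = "spearman")
  diag(R) <- 0
  R
}
score_sketch <- function(A, B) sqrt(sum((A - B)^2))

set.seed(1)
dataA <- matrix(rnorm(40 * 4), 40, 4)
dataB <- matrix(rnorm(30 * 4), 30, 4)
res_sketch <- perm_test_sketch(dataA, dataB, permnum = 200,
                               estimate = estimate_sketch, score = score_sketch)
```

With permnum permutations, the smallest attainable pseudocount p-value is 1/(permnum + 1), which is one reason why a small permnum limits the resolution of the result.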

Note that when the underlying data are paired (paired=TRUE), the input data tables need to have exactly the same dimensions, i.e. the same number of columns (nodes) AND rows (samples).
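
One common way to permute paired data, which also shows why identical dimensions are needed, is to swap the two members of each pair at random. The sketch below assumes that row i of A and row i of B belong to the same subject; the helper paired_swap_sketch is hypothetical, and the permutation scheme actually used by the package for paired=TRUE is not described here and may differ.

```r
# Hypothetical helper (not part of the package): one paired permutation,
# obtained by swapping row i of A with row i of B with probability 1/2.
paired_swap_sketch <- function(A, B) {
  A <- as.matrix(A)
  B <- as.matrix(B)
  stopifnot(identical(dim(A), dim(B)))   # same number of rows AND columns
  swap <- runif(nrow(A)) < 0.5           # which pairs are swapped
  A_perm <- A
  B_perm <- B
  A_perm[swap, ] <- B[swap, ]
  B_perm[swap, ] <- A[swap, ]
  list(A = A_perm, B = B_perm, swapped = swap)
}

set.seed(1)
pairA <- matrix(1:12, 4, 3)
pairB <- pairA + 100
perm_pair <- paired_swap_sketch(pairA, pairB)
```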

Value

a list, whose specific structure depends on the specified network difference characteristic in the argument score.funct (with N denoting the number of nodes):

  • if score.funct is one of frobenius.metric, global.str, maximum.metric, spec.dist or jaccard.dist:
    a list with 6 elements: the adjacency matrix for input data set A (N × N matrix), the adjacency matrix for input data set B (N × N matrix), the value of the test statistic (vector of length 1), the values of the test statistic under the permutations (vector of length permnum), the p-value (vector of length 1), the p-value when inserting a pseudocount in the p-value calculation to avoid p-values that are exactly zero in permutation-based settings (vector of length 1)

  • if score.funct is number.differences:
    a list with 6 elements: output when applying create.graph to input data table A (list with 15 elements; see documentation of create.graph function for details), output when applying create.graph to input data table B (list with 15 elements; see documentation of create.graph function for details), the values of the test statistics (vector of length 6), the values of the test statistics under the permutations (permnum × 6 matrix), the p-values (vector of length 6), the p-values when inserting a pseudocount in the p-value calculation to avoid p-values that are exactly zero in permutation-based settings (vector of length 6)

  • if score.funct is one of betweenness.inv, closeness.inv, degree.inv or eigen.inv:
    a list with 6 elements: the adjacency matrix for input data set A (N × N matrix), the adjacency matrix for input data set B (N × N matrix), the value of the test statistic for each node (vector of length N), the values of the test statistic for each node under the permutations (permnum × N matrix), the p-value for each node (vector of length N), the p-value for each node when inserting a pseudocount in the p-value calculation to avoid p-values that are exactly zero in permutation-based settings (vector of length N)

  • if score.funct is one of edge.inv or edge.inv.direc:
    a list with 8 elements: the adjacency matrix for input data set A (N × N matrix), the adjacency matrix for input data set B (N × N matrix), the value of the test statistic for each node-node pair (N × N matrix), the values of the test statistic for each node-node pair under the permutations (permnum × N × N array), the p-value for each node-node pair (N × N matrix), the p-value for each node-node pair when inserting a pseudocount in the p-value calculation to avoid p-values that are exactly zero in permutation-based settings (N × N matrix), a simplified overview of the p-values for the node-node pairs (N(N-1)/2 × 3 matrix; node-node pair in columns 1 and 2, corresponding p-value in column 3), a simplified overview of the p-values for the node-node pairs when inserting a pseudocount in the p-value calculation (N(N-1)/2 × 3 matrix; node-node pair in columns 1 and 2, corresponding p-value in column 3)
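
The relation between the N × N p-value matrix and the simplified N(N-1)/2 × 3 overview can be sketched in base R as follows. The helper pvalue_overview_sketch, the column names and the row ordering are assumptions made for illustration; the overview returned by the package may be arranged differently.

```r
# Hypothetical helper (not part of the package): collapse a symmetric N x N
# p-value matrix into one row per node-node pair.
pvalue_overview_sketch <- function(P) {
  idx <- which(upper.tri(P), arr.ind = TRUE)   # each unordered pair once
  cbind(node1 = idx[, "row"], node2 = idx[, "col"], pvalue = P[idx])
}

# Toy 4 x 4 p-value matrix with 4 * 3 / 2 = 6 node-node pairs
P_toy <- matrix(0, 4, 4)
P_toy[upper.tri(P_toy)] <- seq(0.1, 0.6, by = 0.1)
P_toy <- P_toy + t(P_toy)

overview_toy <- pvalue_overview_sketch(P_toy)
```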

Examples

## assumes the DNT package, which provides ExDataA and ExDataB, is installed
library(DNT)

##examples using (a) overall network difference characteristics
res1 <- perm.test.nw(A = ExDataA, B = ExDataB, permnum = 10000,
                     methodlist = list("PCSpearman"),
                     score.funct = frobenius.metric)
res2 <- perm.test.nw(A = ExDataA, B = ExDataB, permnum = 10000,
                     methodlist = list("Spearman"),
                     score.funct = global.str, paired = TRUE)

##examples using (b) node-specific network difference characteristics
res3 <- perm.test.nw(A = ExDataA, B = ExDataB, permnum = 10000,
                     methodlist = list("Spearman"), thresh = 0.1,
                     score.funct = betweenness.inv, paired = TRUE)
res4 <- perm.test.nw(A = ExDataA, B = ExDataB, permnum = 10000,
                     methodlist = list("EBICglasso", "spearman", 0.1),
                     score.funct = degree.inv)

##examples using (c) edge-specific network difference characteristics
res5 <- perm.test.nw(A = ExDataA, B = ExDataB, permnum = 10000,
                     methodlist = list("Spearman.adj", "bonferroni"),
                     score.funct = edge.inv)
res6 <- perm.test.nw(A = ExDataA, B = ExDataB, permnum = 10000,
                     methodlist = list("EBICglasso", "spearman", 0.01),
                     score.funct = edge.inv.direc, paired = TRUE)


RomanSchefzik/DNT documentation built on Sept. 11, 2022, 10:29 p.m.