Measure the strength of association between a phenotype and a network by computing the strength of hit clustering on the network
Description
Compute the strength of clustering of high-weight vertices (hits) on a network using a modified version of Ripley's K-statistic. This method can be used to measure the strength of association between a phenotype or function and a network.
Usage
Arguments
g 

nperm 
Integer value, the number of permutations to be completed. 
dist.method 
String, the method used to calculate the distance between vertex pairs. 
vertex.attr 
Character vector, the name of the vertex attributes under which the vertex weights to be tested are stored. The vector can contain one or more elements. 
edge.attr 
String, the name of the edge attribute to be used as distances along the edges. If an edge attribute with this name is not found, then each edge is assumed to have a distance of 1. 
correct.factor 
Numeric value, if the network contains unconnected vertices, then the distance between these vertices is set as the maximum distance between the connected vertices multiplied by 
nsteps 
Integer value, the number of bins into which vertex pairs are placed. 
prob 
Numeric vector, the quantiles to be calculated for the 
parallel 
Numeric value or 
B 
Symmetrical numeric matrix. A precomputed distance bin matrix for 
verbose 
Logical, if 
Details
The SANTA method uses the 'guilt-by-association' principle to measure the strength of association between a network and a phenotype. It does this by measuring the strength of clustering of the phenotype scores across the network. The stronger the clustering, the greater the association between the network and the phenotype.
The SANTA method applies Ripley's K-function, a well-established approach to spatial statistics that measures the strength of clustering of points on a plane, and extends it in a number of ways. First, a Knet function is defined by adapting the approach for networks using vertex pair distance measures. Second, vertex weights are incorporated into Knet and the importance of vertices made relative to their own associated weight. Third, the mean vertex weight is subtracted from each individual vertex weight when calculating the Knet function. This means that the Knet function measures the degree of vertex weight clustering relative to a random distribution of vertex weights. The Knet function is defined as
K^{net}(s) = \frac{2}{p^2} \sum_i p_i \sum_j (p_j - \bar{p}) \, I(d^g(i,j) \le s)
where p_i is the weight of vertex i, \bar{p} is the mean vertex weight across all vertices, d^g(i,j) is the distance between vertex i and vertex j on the network, and I(d^g(i,j) \le s) is an indicator function, equaling 1 if vertex i and vertex j are within distance s and 0 otherwise.
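The definition above can be sketched in base R on a toy network. This is an illustrative sketch, not the package's implementation: the helper name `knet_curve` is hypothetical, the network is a five-vertex path whose shortest-path distances are written down directly, and the normalising constant p, which the text does not define, is assumed here to be the sum of the vertex weights.

```r
# Toy network: a path 1-2-3-4-5; d[i, j] is the shortest-path distance.
n <- 5
d <- abs(outer(seq_len(n), seq_len(n), "-"))

# Hypothetical helper: evaluate the Knet formula at each distance in s_values.
knet_curve <- function(d, w, s_values) {
  w_bar <- mean(w)   # mean vertex weight
  p <- sum(w)        # ASSUMPTION: p is taken as the total vertex weight
  sapply(s_values, function(s) {
    ind <- (d <= s) * 1   # indicator I(d(i, j) <= s); self-pairs are included
    2 / p^2 * sum(w * as.vector(ind %*% (w - w_bar)))
  })
}

s_values <- 0:4
w_clustered <- c(1, 1, 0, 0, 0)   # two adjacent hits
w_spread    <- c(1, 0, 0, 0, 1)   # two hits at opposite ends

k_clustered <- knet_curve(d, w_clustered, s_values)
k_spread    <- knet_curve(d, w_spread, s_values)
print(rbind(s = s_values, clustered = k_clustered, spread = k_spread))
```

Because the mean weight is subtracted inside the inner sum, the curve returns to zero once s reaches the largest distance in the network, and adjacent hits give a higher curve at short distances than hits placed far apart.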
In order to derive a p-value and quantify the significance of the observed distribution of weights, the observed Knet-curve is compared to Knet-curves obtained using the same network but randomly permuted vertex weights. Vertices with missing weights (NA) are not included within these permutations. The area under the Knet-curve (AUK) is calculated for the observed network and each of the permuted networks, and a z-score is used to produce a p-value. This p-value is the probability of observing an AUK at least this high under the null hypothesis that the vertex weights are randomly distributed.
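The permutation procedure can be sketched in base R as follows. This is a hedged illustration, not the package's code: the helpers `knet_curve` and `auk` are hypothetical, p is assumed to be the sum of the vertex weights, the area is assumed to be computed with the trapezoidal rule, the z-score is converted to a one-sided p-value with a normal approximation, and the toy weights contain no missing values.

```r
set.seed(1)

# Toy network: a path of 30 vertices with hits clustered at one end.
n <- 30
d <- abs(outer(seq_len(n), seq_len(n), "-"))
w <- c(rep(1, 6), rep(0, n - 6))
s_values <- 0:(n - 1)

# Hypothetical helper: Knet curve (ASSUMPTION: p is the total vertex weight).
knet_curve <- function(d, w, s_values) {
  w_bar <- mean(w)
  p <- sum(w)
  sapply(s_values, function(s) {
    ind <- (d <= s) * 1
    2 / p^2 * sum(w * as.vector(ind %*% (w - w_bar)))
  })
}

# Hypothetical helper: area under the curve (ASSUMPTION: trapezoidal rule).
auk <- function(k, s) sum(diff(s) * (head(k, -1) + tail(k, -1)) / 2)

# Observed AUK, then AUKs for randomly permuted vertex weights.
auk_obs <- auk(knet_curve(d, w, s_values), s_values)
nperm <- 200
auk_perm <- replicate(nperm, auk(knet_curve(d, sample(w), s_values), s_values))

# z-score of the observed AUK against the permuted AUKs, then an upper-tail p-value.
z <- (auk_obs - mean(auk_perm)) / sd(auk_perm)
p_val <- pnorm(z, lower.tail = FALSE)
print(c(AUK.obs = auk_obs, z = z, pval = p_val))
```

Permuting with `sample(w)` keeps the network and the set of weights fixed and only changes which vertex carries which weight, which is the null hypothesis described above.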
If parallel computing is possible, parallel can be used to split permutations over multiple cores. The snow package is used to manage the parallel computing. If parallel=NULL or parallel computing is not possible, then only one core is used. If a positive integer is input and parallel computing is possible, then the permutations are split over up to this many cores.
Vertex weights should be greater than or equal to zero, or NA if the weight is missing.
Value
If one vertex attribute is input, Knet is run on the single set of vertex weights and a list containing the statistics below is returned. If more than one vertex attribute is input, then Knet is run on each set of vertex weights and a list containing an element for each vertex attribute is returned. Each element contains a sub-list containing the statistics below for the relevant vertex attribute.
K.obs 
Knet-function curve for the observed vertex weights. 
AUK.obs 
Area under the Knet-function curve (AUK) for the observed vertex weights. 
K.perm 
Knet-function curve for each permutation of vertex weights. Equals 
AUK.perm 
Area under the Knet-function curve (AUK) for each permutation of vertex weights. 
K.quan 
Quantiles for the permuted Knet-function curves. 
pval 
p-value, calculated from a z-score derived from the observed and permuted AUKs. 
Author(s)
Alex J. Cornish a.cornish12@imperial.ac.uk and Florian Markowetz
References
Cornish, A.J. and Markowetz, F. (2014). SANTA: Quantifying the Functional Content of Molecular Networks. PLOS Computational Biology. 10:9, e1003808.
Okabe, A. and Yamada, I. (2001). The K-function method on a network and its computational implementation. Geographical Analysis. 33(3): 271-290.
See Also
Knode
Examples
# load SANTA (barabasi.game comes from igraph, which SANTA depends on)
library(SANTA)

# apply Knet to a network with hit clustering
g.clustered <- barabasi.game(50, directed=FALSE)
g.clustered <- SpreadHits(g.clustered, h=10, lambda=10)
res.clustered <- Knet(g.clustered, nperm=100, vertex.attr="hits")
res.clustered$pval
plot(res.clustered)

# apply Knet to a network without hit clustering
g.unclustered <- barabasi.game(50, directed=FALSE)
g.unclustered <- SpreadHits(g.unclustered, h=10, lambda=0)
res.unclustered <- Knet(g.unclustered, nperm=100, vertex.attr="hits")
res.unclustered$pval
plot(res.unclustered)
