dNetFind is supposed to find the maximum scoring subgraph from
an input graph and scores imposed on its nodes. The input graph and the
output subgraph are both of "igraph" or "graphNET" object. The input
scores imposed on the nodes in the input graph can be divided into two
parts: the positive nodes and the negative nodes. The searching for
maximum scoring subgraph is deduced to find the connected subgraph
containing the positive nodes as many as possible, but the negative
nodes as few as possible. To this end, a heuristic search is used (see
an object of class "igraph" or "graphNEL"
a vector of scores. For each element, it must have the name that could be mapped onto the input graph. Also, the names in input "scores" should contain all those in the input graph "g", but the reverse is not necessary
a subgraph with a maximum score, an object of class "igraph" or "graphNEL". It has node attributes 'score' and 'type' (either 'desired' or 'linker')
The search procedure is heuristic to find the subgraph with the maximum score:
i) transform the input graph into a new graph by collapsing connected positive nodes into a meta-node. As such, meta-nodes are isolated to each other but are linked via negative nodes (single-nodes). Clearly, meta-nodes have positive scores, and negative scores for the single-nodes.
ii) append the weight attribute to the edges in the transformed graph. There are two types of edges: 1) the single-single edge with two single-nodes as two ends, and 2) single-meta edge with a single-node as one end and a meta-node as the other end. The weight for a single-single edge is the absolute sum of the scores in its two-end single-nodes but normalised by their degrees. The weight for a single-meta edge is simply the absolute score in its single-node end normalised by the degree. As such, weights are all non-negative.
iii) find minimum spanning tree (MST) in the weighted transformed graph using Prim's greedy algorithm. A spanning tree of the weighted graph is a subgraph that is tree and connects all the node together. The MST is a spanning tree with the sum of its edge weights minimised amongst all possible spanning trees.
iv) find all shortest paths between any pair of meta-nodes in the MST. Within the weighted transformed graph in ii), a subgraph is induced containing nodes (only occuring in these shortest paths) and all edges between them.
v) within the induced subgraph, identify single-nodes that are direct neighbors of meta-nodes. For each of these single-nodes, also make sure it has the absolute scores no more than the sum of scores in its neighboring meta-nodes. These single-nodes meeting both criteria are called "linkers".
vi) still within the induced subgraph in v), find the linker graph that contains only linkers and edges between them. Similarly to iii), find MST of the linker graph, called 'linker MST'. Notably, this linker MST serves as the scaffold, which only contains linkers but has meta-nodes being direcly attached to.
vii) in linker MST plus its attached meta-nodes, find the optimal path that has the sum of scores of its nodes and attached meta-nodes maximised amongest all possible paths. Nodes along this optimal path plus their attached meta-nodes are called 'subgraph nodes'.
viii) finally, from the input graph extract a subgraph (called 'subgraph') that only contains subgraph nodes and edges betwen them. This subgraph is the maximum scoring subgraph containing the positive nodes as many as possible, but the negative nodes as few as possible.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
# 1) generate an vector consisting of random values from beta distribution x <- rbeta(1000, shape1=0.5, shape2=1) # 2) fit a p-value distribution under beta-uniform mixture model fit <- dBUMfit(x, ntry=1, hist.bum=FALSE, contour.bum=FALSE) # 3) calculate the scores according to the fitted BUM and fdr=0.01 # using "pdf" method scores <- dBUMscore(fit, method="pdf", fdr=0.05, scatter.bum=FALSE) names(scores) <- as.character(1:length(scores)) # 4) generate a random graph according to the ER model g <- erdos.renyi.game(1000, 1/100) # 5) produce the induced subgraph only based on the nodes in query subg <- dNetInduce(g, V(g), knn=0) # 6) find the subgraph with the maximum score subgraph <- dNetFind(subg, scores)
Loading required package: igraph Attaching package: 'igraph' The following objects are masked from 'package:stats': decompose, spectrum The following object is masked from 'package:base': union Loading required package: supraHex Loading required package: hexbin A total of p-values: 1000 Maximum Log-Likelihood: 352.1 Mixture parameter (lambda): 0.000 Shape parameter (a): 0.479
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.