net_query: Network querying method based on conditional random fields


net_query R Documentation

Description

Find the best-matching subnetworks in a large target network for small query networks, using a conditional random field (CRF) model.

Usage

net_query(
  query.net,
  target.net,
  node.sim,
  query.type = 4,
  delta.d = 1e-10,
  delta.c = 0.5,
  delta.e = 1,
  delta.s = 1,
  output = "result.txt"
)

net_query_batch(
  query.nets,
  target.net,
  node.sim,
  query.type = 4,
  delta.d = 1e-10,
  delta.c = 0.5,
  delta.e = 1,
  delta.s = 1,
  output = "result.txt"
)

Arguments

query.net

The input file name of the query network.

target.net

The input file name of the target network.

node.sim

The input file name of the node similarity scores between the query network and the target network.

query.type

The query network type: 1 - general, 2 - chain, 3 - tree, 4 - heuristic.

delta.d

Parameter controlling the penalty for deletions.

delta.c

Parameter controlling the penalty for consecutive deletions.

delta.e

Parameter controlling the penalty for a single deletion.

delta.s

Parameter controlling the penalty for insertions.

output

The suffix of the output file name.

query.nets

The vector of input file names of the query networks.

Details

This approach to the network querying problem is based on a conditional random field (CRF) model. It can handle both undirected and directed networks, acyclic and cyclic networks, and any number of insertions/deletions.

When several query networks are searched against the same target network, net_query_batch saves considerable time compared with calling net_query once per query.

  • query.net: The query network file is written as follows:
    v1 v2 v3 v4 v5
    v3 v4
    ...
    where v1, v2, v3, v4, v5 ... are node names, and each line indicates edges between the first node and each of the other nodes on that line. For example, the first line denotes 4 edges: (v1, v2), (v1, v3), (v1, v4), and (v1, v5).

  • target.net: The format of this file is the same as the query network file.

  • node.sim: This similarity file's format is as follows:
    v1 V1 s1
    v1 V2 s2
    ...
    where v1 is a node in the query network, V1 is a node in the target network, s1 is the similarity score between v1 and V1, and so on.

  • query.type: If query.type = 1, the loopy belief propagation (LBP) algorithm is applied, which is an approximate algorithm for a general graph with loops. If the query is a chain or a tree, exact algorithms are available: set query.type = 2 when the query is a chain, and query.type = 3 when the query is a tree. When query.type = 4, the heuristic algorithm is used, which tries the exact algorithm (junction tree algorithm) first and falls back to the LBP algorithm if the exact algorithm fails. The default value is 4.

  • delta.d: The smaller delta.d is, the heavier the penalty for deletions.

  • delta.c: The smaller delta.c is, the heavier the penalty for consecutive deletions.

  • delta.e: The smaller delta.e is, the heavier the penalty for a single deletion.

  • delta.s: The larger delta.s is, the heavier the penalty for insertions.
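The file layouts described above can be produced from R with base functions. The following is a minimal sketch, assuming whitespace-separated fields as displayed in the layouts; the node names, scores, and the helper lines_to_edges are made up for illustration and are not part of Corbi.

```r
# Toy data: node names and scores below are made up for illustration.
query_lines  <- c("v1 v2 v3", "v3 v4")
target_lines <- c("V1 V2 V3 V4", "V3 V5")
node_sim <- data.frame(
  query  = c("v1", "v2", "v3", "v4"),
  target = c("V1", "V2", "V3", "V4"),
  score  = c(0.9, 0.8, 0.7, 0.6)
)

# Write the three input files to a temporary directory.
in_dir      <- tempdir()
query_file  <- file.path(in_dir, "querynet.txt")
target_file <- file.path(in_dir, "targetnet.txt")
sim_file    <- file.path(in_dir, "nodesim.txt")
writeLines(query_lines, query_file)
writeLines(target_lines, target_file)
write.table(node_sim, sim_file, quote = FALSE, sep = " ",
            row.names = FALSE, col.names = FALSE)

# Hypothetical helper (not part of Corbi): expand each
# "first node, then its neighbours" line into one row per edge.
lines_to_edges <- function(lines) {
  parts <- strsplit(trimws(lines), "[[:space:]]+")
  parts <- parts[lengths(parts) >= 2]  # a lone node name defines no edge
  do.call(rbind, lapply(parts, function(p) cbind(from = p[1], to = p[-1])))
}

# Read the query file back and list the edges it encodes.
query_edges <- lines_to_edges(readLines(query_file))
print(query_edges)
```

Expanding the lines back into an edge list is a quick way to check that a file encodes the intended edges before passing its name to net_query.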

References

Qiang Huang, Ling-Yun Wu, and Xiang-Sun Zhang. An Efficient Network Querying Method Based on Conditional Random Fields. Bioinformatics, 27(22):3173-3178, 2011.

Examples


## Not run: 
library(Corbi)

## An example: "querynet.txt", "targetnet.txt", "nodesim.txt" are
## three input files in the working directory
net_query("querynet.txt", "targetnet.txt", "nodesim.txt", query.type=3)

## End(Not run)


## Not run: 
## Batch example
net_query_batch(c("querynet.txt", "querynet2.txt"),
  "targetnet.txt", "nodesim.txt", query.type=3)

## End(Not run)
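The examples above pass query.type=3, which the Details section reserves for tree-shaped queries. As a rough sketch of how one might check the shape of a query before choosing query.type, the hypothetical helper below (not part of Corbi) treats the query as undirected, assumes at least one edge and no self-loops, and returns 2 for a chain, 3 for a tree, and the documented default 4 otherwise.

```r
# Hypothetical helper, not part of Corbi: suggest a query.type from a
# two-column character matrix of edges, viewed as an undirected graph.
# Assumes at least one edge and no self-loops.
suggest_query_type <- function(edges) {
  # Collapse repeated edges, including reversed copies such as (a, b) and (b, a).
  key <- apply(edges, 1, function(e) paste(sort(e), collapse = " "))
  edges <- edges[!duplicated(key), , drop = FALSE]
  nodes <- unique(as.vector(edges))
  n <- length(nodes)
  m <- nrow(edges)

  # Breadth-first search from the first node to test connectivity.
  seen <- nodes[1]
  frontier <- nodes[1]
  while (length(frontier) > 0) {
    neighbours <- c(edges[edges[, 1] %in% frontier, 2],
                    edges[edges[, 2] %in% frontier, 1])
    frontier <- setdiff(neighbours, seen)
    seen <- c(seen, frontier)
  }
  connected <- length(seen) == n

  # A connected graph with n - 1 edges is a tree.
  is_tree <- connected && m == n - 1
  if (!is_tree) return(4L)  # fall back to the documented default (heuristic)

  # A tree whose nodes all have degree at most 2 is a chain.
  degree <- table(as.vector(edges))
  if (max(degree) <= 2) 2L else 3L
}

# Toy queries with made-up node names.
chain_edges <- cbind(c("a", "b"), c("b", "c"))
star_edges  <- cbind(c("a", "a", "a"), c("b", "c", "d"))
cycle_edges <- cbind(c("a", "b", "c"), c("b", "c", "a"))

print(suggest_query_type(chain_edges))
print(suggest_query_type(star_edges))
print(suggest_query_type(cycle_edges))
```

Whether the undirected view is appropriate for a directed query is an assumption of this sketch; when in doubt, the default query.type = 4 lets the heuristic choose.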


Corbi documentation built on May 3, 2022, 3:01 a.m.