To run the RMut package and use all of its functions, complete the following three installations in sequence:
The core algorithms of the RMut package are written in Java, so a Java SE Development Kit (JDK) is required to run the package. The JDK is available at:
http://www.oracle.com/technetwork/java/javase/downloads/index.html.
Either of the following two JDK series can be used:
Old series
Java SE 8u201 / Java SE 8u202 or higher version
New series
Java SE 11.0.2(LTS) or higher version
First, install the devtools package by typing the following command into the R console:
> install.packages("devtools")
More details about the devtools package can be found at https://github.com/r-lib/devtools.
Make sure that a Java Development Kit is installed and correctly registered in R. If in doubt, run the command R CMD javareconf with root or administrator permissions.
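Before installing rJava, it can help to check from within R whether a Java executable is visible at all. The following is a minimal base-R sketch; the helper name check_java is hypothetical and not part of RMut or rJava, and a positive result only shows that some java binary is on the search path, not that a full JDK is correctly registered in R.

```r
# Hypothetical helper (not part of RMut or rJava): report what R can see of Java.
check_java <- function() {
  java_path <- Sys.which("java")  # "" if no java executable is on the PATH
  list(
    java_home    = Sys.getenv("JAVA_HOME", unset = NA),  # NA if the variable is not set
    java_on_path = unname(nzchar(java_path)),
    java_path    = unname(java_path)
  )
}

status <- check_java()
str(status)
```

If java_on_path is FALSE, or java_home is NA, running R CMD javareconf as described above is a reasonable next step.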
Next, install the RMut package into the R environment by typing the following commands:
> install.packages("rJava")
> devtools::install_github("csclab/RMut", INSTALL_opts="--no-multiarch")
Note that recent versions of the devtools package use the keyword INSTALL_opts, instead of the old keyword args, to specify additional installation options. Although all of the core algorithms are written in Java, the rJava package must be installed in the R environment before RMut is installed. After installation, the RMut package can be loaded via
> library(RMut)
In addition, the Java Virtual Machine (JVM) must be initialized with a maximum Java heap size via the function initJVM. This function must be called before any other RMut function is used. The following command initializes the JVM with a maximum Java heap size of 8 GB (for the analysis of large-scale networks, the heap size can be set to a larger value):
> initJVM("8G")
To use the full computing power of multi-core central processing units (CPUs) and graphics processing units (GPUs), OpenCL drivers should be installed on your system. The necessary steps depend on the hardware:
NVIDIA graphics cards
OpenCL support is included in the latest drivers, which can be found on the driver CD or are available at
AMD graphics cards
The OpenCL GPU runtime library is included in the drivers of your AMD cards. The drivers can be found on the driver CD or are available at
CPU devices only (No graphics cards)
At the time this R package was developed, AMD CPU devices were no longer supported as OpenCL devices. For Intel CPU devices, the OpenCL runtime library is available at:
After installation, OpenCL information can be displayed via the function showOpencl. OpenCL computation on a CPU/GPU device can then be enabled via the function setOpencl:
library(RMut)
initJVM("8G")
showOpencl()
setOpencl("gpu")
The above functions show the installed OpenCL platforms with their corresponding CPU/GPU devices, and try to select a graphics card for OpenCL computing.
Networks can be loaded in two ways using RMut:
The loadNetwork function creates a network from a Tab-separated values text file. The file format contains three columns:
The function returns a network object which contains:
Here is an example:
amrn <- loadNetwork("networks/AMRN.sif")
print(amrn)
Finally, the loaded network object amrn has five components:
nodes: a data frame which initially contains one column for node identifiers.
In this example network, there are 10 nodes. Additional columns for other node-based attributes are inserted later.
edges: a data frame which initially contains one column for edge identifiers.
In this example, there are 22 edges. Additional columns for other edge-based attributes are inserted later.
network: a data frame which initially contains one column for the network identifier (AMRN.sif in this case).
Additional columns for other network-based attributes are inserted later, such as the total number of feedback/feed-forward loops.
transitionNetwork: a Boolean variable that denotes whether the network is a transition network; in this case the value is FALSE.
The findAttractors function returns a transition network object in which the transitionNetwork variable has the value TRUE. In all other cases, the variable has the value FALSE.
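To make the tab-separated input format more concrete, the following base-R sketch writes a tiny three-column file and reads it back. The column order (source, interaction, target) follows the usual SIF convention and the interaction codes 1, -1 and 0 follow the activation/inhibition/neutral coding used elsewhere in this document; both are assumptions about what loadNetwork expects, and the toy nodes A, B and C are hypothetical.

```r
# Hypothetical toy network: A activates B, B inhibits C, C has a neutral link to A.
toy_edges <- data.frame(
  source      = c("A", "B", "C"),
  interaction = c(1L, -1L, 0L),   # assumed coding: 1 activation, -1 inhibition, 0 neutral
  target      = c("B", "C", "A"),
  stringsAsFactors = FALSE
)

# Write a tab-separated file without header, quotes, or row names.
sif_path <- tempfile(fileext = ".sif")
write.table(toy_edges, sif_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read it back to confirm that the file has exactly three tab-separated columns.
read_back <- read.delim(sif_path, header = FALSE,
                        col.names = c("source", "interaction", "target"),
                        stringsAsFactors = FALSE)
print(read_back)
```

Under these assumptions, a file written this way could then be passed to loadNetwork in the same way as networks/AMRN.sif above.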
In addition, the package provides some example networks that can be loaded with the data command. For example:
data(amrn)
The package supplies five example datasets, ranging from small-scale to large-scale real biological networks:
amrn
The Arabidopsis morphogenesis regulatory network (AMRN) with 10 nodes and 22 links.
cdrn
The cell differentiation regulatory network (CDRN) with 9 nodes and 15 links.
cchs
The cell cycle pathway of the species Homo sapiens (CCHS) with 161 nodes and 223 links.
ccsn
The canonical cell signaling network (CCSN) with 771 nodes and 1633 links.
hsn
The large-scale human signaling network (HSN) with 1192 nodes and 3102 links.
All original network files (Tab-separated values text files) can be downloaded from the folder vignettes/networks of the RMut website https://github.com/csclab/RMut.
A user can retrieve pathways from the WikiPathways database (https://www.wikipathways.org) as a SIF file using the WikiPathways plugin of the Cytoscape software. The Cytoscape version should be 3.6.1 or later.
First, the pathway is loaded into Cytoscape by following the steps indicated in Figures 1 and 2.
After that, we select the "Edge Table" tab and detach it for easy modification (Figure 3).
The attribute (column) interaction does not contain relationship types (activation, inhibition, or neutral), so we must fill them in based on existing columns as follows:
activation interaction (value is 1)
When at least one of the corresponding columns WP.type or Source Arrow Shape has the value "mim-conversion" or "Arrow".
inhibition interaction (value is -1)
When at least one of the corresponding columns WP.type or Source Arrow Shape has the value "mim-inhibition" or "TBar".
neutral interaction (value is 0)
When both of the corresponding columns WP.type and Source Arrow Shape have the value "Line", or the corresponding column WP.type is empty.
For each type of interaction, we select the rows (interactions) that satisfy the above conditions, and then modify the values of the column interaction as shown in Figure 4.
To repeat this step for the other types, we deselect the edges by clicking in the empty space of the network visualization panel. Finally, we export the pathway to the SIF file format via the menu File | Export | Network... . Rows with a missing interaction type may need to be removed from the SIF file with spreadsheet software such as Microsoft Excel (Figure 5).
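The three rules above can also be expressed as a small function, which may be handy when an edge table has been exported from Cytoscape and is edited in R instead of by hand. This is only a sketch: the function name classify_interaction is hypothetical, and the precedence used when more than one rule matches (activation, then inhibition, then neutral) is an assumption, since the text does not say how such conflicts are resolved. Rows that match no rule receive NA, which corresponds to the rows with a missing interaction type that are removed in the last step.

```r
# Hypothetical helper: map the two Cytoscape columns to 1 / -1 / 0 / NA.
classify_interaction <- function(wp_type, arrow_shape) {
  # Treat missing values as empty strings.
  wp_type     <- ifelse(is.na(wp_type), "", wp_type)
  arrow_shape <- ifelse(is.na(arrow_shape), "", arrow_shape)

  act_values <- c("mim-conversion", "Arrow")
  inh_values <- c("mim-inhibition", "TBar")

  activation <- wp_type %in% act_values | arrow_shape %in% act_values
  inhibition <- wp_type %in% inh_values | arrow_shape %in% inh_values
  neutral    <- (wp_type == "Line" & arrow_shape == "Line") | wp_type == ""

  # Assumed precedence: later assignments overwrite earlier ones.
  out <- rep(NA_integer_, length(wp_type))
  out[neutral]    <- 0L
  out[inhibition] <- -1L
  out[activation] <- 1L
  out
}

codes <- classify_interaction(
  wp_type     = c("mim-conversion", "mim-inhibition", "Line", "",      "other"),
  arrow_shape = c("Arrow",          "TBar",           "Line", "Arrow", "Diamond")
)
print(codes)
```

Rows for which the result is NA can be dropped with a logical filter such as !is.na(codes) before the table is written back to a SIF file.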
The package uses a Boolean network model with a synchronous updating scheme, and provides two types of analyses of Boolean dynamics in real biological networks or random networks:
Via the calSensitivity function, the package computes nodal/edgetic sensitivity against many types of mutations in terms of Boolean dynamics. We classify ten well-known mutations into two types (refer to the RMut paper for more details):
Node-based mutations: state-flip, rule-flip, outcome-shuffle, knockout and overexpression
Edgetic mutations: edge-removal, edge-attenuation, edge-addition, edge-sign-switch, and edge-reverse
Two kinds of sensitivity measures are computed: macro-distance and bitwise-distance sensitivity. In addition, multiple sets of random Nested Canalyzing rules can be specified, which results in multiple sensitivity values for each node/edge. Here, we show an example of some sensitivity types:
data(amrn)

# generate all possible initial-states each containing 10 Boolean nodes
set1 <- generateStates(10, "all")

# generate all possible groups each containing a single node in the AMRN network
amrn <- generateGroups(amrn, "all", 1, 0)
amrn <- calSensitivity(amrn, set1, "rule flip", numRuleSets = 2)
print(amrn$Group_1)

# generate all possible groups each containing a single edge in the AMRN network
amrn <- generateGroups(amrn, "all", 0, 1)
amrn <- calSensitivity(amrn, set1, "edge removal")
print(amrn$Group_2)

# generate all possible groups each containing a new edge (not exist in the AMRN network)
amrn <- generateGroups(amrn, "all", 0, 1, TRUE)
amrn <- calSensitivity(amrn, set1, "edge addition")
print(amrn$Group_3)
As shown above, we first generate a set of initial states with the function generateStates. Then, with the function generateGroups, we generate three sets of node/edge groups whose sensitivity is to be calculated. Finally, the sensitivity values are stored in the same data frame as the node/edge groups. The data frame has one column for group identifiers (lists of nodes/edges), followed by columns containing the sensitivity values for each set of random update rules. For example, the rule-flip mutation used two sets of Nested Canalyzing rules and thus produced two corresponding sets of sensitivity values. RMut automatically generates a file of Boolean logic for each set, or uses existing files in the working directory of RMut. Here, two rule files, "AMRN_rules_0" and "AMRN_rules_1", are generated. A user can manually create or modify these rule files before the calculation. In addition, column names that contain the sequence "macro" or "bitws" denote the macro-distance and bitwise-distance sensitivity measures, respectively.
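Because the two measures are distinguished only by the substrings "macro" and "bitws" in the column names, the corresponding columns can be picked out by pattern matching. The sketch below works on a mock data frame: the group identifiers, the exact column names and the numbers are all hypothetical and are not RMut output, and averaging over rule sets is just one possible way to summarize them.

```r
# Mock group table (hypothetical names and values, NOT actual RMut output).
groups <- data.frame(
  GroupID           = c("G1", "G2"),
  ruleflip_r0_macro = c(0.10, 0.30),
  ruleflip_r0_bitws = c(0.02, 0.06),
  ruleflip_r1_macro = c(0.20, 0.50),
  ruleflip_r1_bitws = c(0.04, 0.10),
  stringsAsFactors = FALSE
)

# Select the columns of each measure by the substring in their names.
macro_cols <- grep("macro", names(groups), value = TRUE, fixed = TRUE)
bitws_cols <- grep("bitws", names(groups), value = TRUE, fixed = TRUE)

# Average each measure over the rule sets, one value per group.
avg <- data.frame(
  GroupID    = groups$GroupID,
  mean_macro = unname(rowMeans(groups[macro_cols])),
  mean_bitws = unname(rowMeans(groups[bitws_cols])),
  stringsAsFactors = FALSE
)
print(avg)
```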
Via the findAttractors function, the landscape of network state transitions, along with the attractor cycles, is identified. The returned transition network object has the same structure as the normal network object returned by the loadNetwork function (see section "loadNetwork function"). An example is demonstrated as follows:
data(amrn)

# generate all possible initial-states each containing 10 Boolean nodes
set1 <- generateStates(10, "all")

# generate a set of only conjunction rules
generateRule(amrn)

transNet <- findAttractors(amrn, set1)

# print some first network states
head(transNet$nodes)

# print some first transition links between network states
head(transNet$edges)

output(transNet)
As shown in the example, the nodes and edges data frames of the transNet object differ in some respects from those of normal network objects:
nodes:
The first column is also used for node identifiers, but in this case they represent states of the analyzed network amrn. There are 1024 nodes, which correspond to the 1024 network states of amrn.
Additional columns are described as follows:
Attractor: the value 1 denotes that the network state belongs to an attractor, otherwise 0.
NetworkState: specifies the network state of the node.
edges:
The first column is also used for edge identifiers, but in this case they represent transition links of the analyzed network amrn. Each edge identifier is a string containing (1), which denotes a directed link between two node identifiers. There are 1024 edges, which correspond to the 1024 transition links of amrn.
Additional columns are described as follows:
Attractor: the value 1 means that the transition link connects two network states of an attractor, otherwise 0.
We take the node N6 as an example. Its corresponding network state is 0000000101, which represents the Boolean values of all nodes of the analyzed network amrn in alphabetical order:
data(amrn)
amrn <- findFBLs(amrn)

# print the node identifiers in fixed-width columns
for (i in seq_along(amrn$nodes$NodeID)) {
  s <- format(amrn$nodes$NodeID[i], width = 8)
  cat(s)
}
cat("\n")

# print the Boolean value of each node beneath its identifier
state <- "0000000101"
a <- unlist(strsplit(state, split = ""))
for (i in seq_along(a)) {
  s <- format(a[i], width = 8)
  cat(s)
}
Moreover, the Attractor value 1 means that N6 belongs to an attractor. The edges data frame also shows a transition link N6 (1) N6 with Attractor value 1, which means that N6 (1) N6 is a fixed-point attractor.
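To illustrate what a synchronous transition network and a fixed-point attractor look like, the following base-R sketch enumerates all states of a toy three-node Boolean network and follows each one to its attractor. The network, its update rules and the helper name sync_update are hypothetical; the sketch only illustrates the concept and makes no claim about how findAttractors is implemented.

```r
# Toy 3-node Boolean network (hypothetical; not one of the RMut datasets).
# Each row of `states` is one network state with 0/1 values for A, B and C.
states <- as.matrix(expand.grid(A = 0:1, B = 0:1, C = 0:1))

# Synchronous update: every node is recomputed at once from the previous state.
sync_update <- function(s) {
  c(A = s[["B"]] * s[["C"]],  # A is on only if B AND C are on
    B = s[["A"]],             # B copies A
    C = 1 - s[["A"]])         # C is inhibited by A
}

# Encode states as strings and build the transition map: state -> successor.
keys      <- apply(states, 1, paste, collapse = "")
next_key  <- apply(states, 1, function(s) paste(sync_update(s), collapse = ""))
successor <- setNames(next_key, keys)

# After as many steps as there are states, every trajectory lies inside an attractor.
current <- keys
for (i in seq_along(keys)) current <- unname(successor[current])
attractor_states <- sort(unique(current))

# A fixed-point attractor is a state that is its own successor.
fixed_points <- unname(keys[keys == next_key])

print(attractor_states)
print(fixed_points)
```

In this toy network every one of the 2^3 = 8 states eventually reaches the single fixed point 001, which plays the same role as the self-link N6 (1) N6 in the AMRN transition network.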
Finally, the resulting transition network can be exported with the function output (see section "Export results"). Three files are written: AMRN_trans.sif for the transition network itself, and AMRN_trans_out_nodes.csv and AMRN_trans_out_edges.csv for the node and edge attributes, respectively. These files can then be loaded and analyzed with other software that offers powerful visualization functions, such as Cytoscape. For more information on Cytoscape, please refer to http://www.cytoscape.org/. In this tutorial, we used Cytoscape version 3.4.0.
The transition network is written as a SIF file (*.sif), which can be loaded into Cytoscape via the following menu:
File | Import | Network | File... or using the shortcut keys Ctrl/Cmd + L (Figure 6(a))
Next, we import the two CSV files of node/edge attributes via the File | Import | Table | File... menu (Figure 6(b)). For the node attributes file, we should select the String data type for the column NetworkState (Figure 7). For the edge attributes file, we must select Edge Table Columns in the drop-down list beside the text Import Data as: (Figure 8).
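The reason for choosing the String type becomes visible when a similar file is read back into R: a state such as 0000000101 loses its leading zeros if it is parsed as a number. The sketch below uses a small mock file; its layout and rows are hypothetical and only mimic the columns NodeID, Attractor and NetworkState described above.

```r
# Mock node attribute file (hypothetical layout and rows, not actual RMut output).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("NodeID,Attractor,NetworkState",
             "N6,1,0000000101",
             "N7,0,0000000110"),
           csv_path)

# Default parsing converts the state to a number and drops the leading zeros.
naive <- read.csv(csv_path)

# Forcing the column to character keeps the full 10-digit state string.
typed <- read.csv(csv_path, colClasses = c(NetworkState = "character"))

print(naive$NetworkState)
print(typed$NetworkState)
```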
After importing, we select the Style panel and adjust the node and edge styles to highlight all attractor cycles. For the node style, select the color red in the Fill Color property for the nodes that belong to an attractor (Figure 9(a)). For the edge style, select the color red in the Stroke Color property and, optionally, change the Width property to a larger value for the edges that connect two states of an attractor (Figure 9(b)).
As a result, Figure 10 shows the modified transition network with a clearer indication of attractor cycles.
Via the findFBLs and findFFLs functions, the package supports searching for feedback loops and feed-forward loops (FBLs/FFLs), respectively, for all nodes/edges in a network. The following is example R code for the search:
data(amrn)

# search feedback/feed-forward loops
amrn <- findFBLs(amrn, maxLength = 10)
amrn <- findFFLs(amrn)

print(amrn$nodes)
print(amrn$edges)
print(amrn$network)
In the above output, some abbreviations in the two nodes/edges data frames are explained as follows (refer to the literature [3-4] in the References section for more details):
NuFBL: number of feedback loops involving the node/edge
NuPosFBL, NuNegFBL: number of positive and negative feedback loops, respectively, involving the node/edge
NuFFL: number of feed-forward loops involving the node/edge
NuFFL_A, NuFFL_B and NuFFL_C: number of feed-forward loops with role A, B and C, respectively, involving the node
NuFFL_AB, NuFFL_BC and NuFFL_AC: number of feed-forward loops with role AB, BC and AC, respectively, involving the edge
In the network data frame, NuFBL, NuPosFBL, NuNegFBL, NuFFL, NuCoFFL and NuInCoFFL denote total numbers of FBLs, positive/negative FBLs, FFLs and coherent/incoherent FFLs in the network, respectively.
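For intuition about the network-level FFL counts, the following base-R sketch counts feed-forward loops in a toy signed network using matrix products. It assumes the common definitions: an FFL consists of the edges A→B, B→C and A→C, and it is coherent when the sign of the direct edge A→C equals the product of the signs along A→B→C. Whether RMut uses exactly these definitions is not stated here, the toy nodes X, Y, Z and W are hypothetical, and the shortcut only works when there are no self-loops and every edge sign is +1 or -1.

```r
# Toy signed adjacency matrix (hypothetical): S[i, j] is the sign of edge i -> j.
nodes <- c("X", "Y", "Z", "W")
S <- matrix(0, nrow = 4, ncol = 4, dimnames = list(nodes, nodes))
S["X", "Y"] <-  1
S["Y", "Z"] <-  1
S["X", "Z"] <-  1   # X -> Y -> Z with X -> Z: coherent FFL
S["X", "W"] <-  1
S["W", "Z"] <- -1   # X -> W -> Z with X -> Z: incoherent FFL

A <- abs(S)  # unsigned adjacency matrix

# (A %*% A)[a, c] counts two-step paths a -> b -> c; multiplying by A[a, c]
# keeps only those that are closed by a direct edge a -> c.
n_ffl <- sum((A %*% A) * A)

# With signs, each coherent FFL contributes +1 and each incoherent FFL -1.
co_minus_inco <- sum((S %*% S) * S)
n_coherent    <- (n_ffl + co_minus_inco) / 2
n_incoherent  <- (n_ffl - co_minus_inco) / 2

print(c(FFL = n_ffl, coherent = n_coherent, incoherent = n_incoherent))
```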
The calCentrality function calculates node-/edge-based centralities of a network such as Degree, In-/Out-Degree, Closeness, Betweenness, Stress, Eigenvector, Edge Degree and Edge Betweenness. An example is demonstrated as follows:
data(amrn)

# calculate node-/edge-based centralities
amrn <- calCentrality(amrn)

print(amrn$nodes)
print(amrn$edges)
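The simplest of these measures, the degree-type centralities, can be read directly off an edge list. The following base-R sketch uses a hypothetical toy edge list rather than an RMut object, and its output column names are illustrative only; how calCentrality computes or names these measures is not shown in this document.

```r
# Toy directed edge list (hypothetical; not an RMut network object).
edges <- data.frame(
  source = c("X", "X", "Y", "W", "X"),
  target = c("Y", "Z", "Z", "Z", "W"),
  stringsAsFactors = FALSE
)

nodes <- sort(unique(c(edges$source, edges$target)))

# Count outgoing and incoming links per node; factor() keeps nodes with zero counts.
out_deg <- table(factor(edges$source, levels = nodes))
in_deg  <- table(factor(edges$target, levels = nodes))

centrality <- data.frame(
  NodeID    = nodes,
  InDegree  = as.integer(in_deg),
  OutDegree = as.integer(out_deg),
  stringsAsFactors = FALSE
)
centrality$Degree <- centrality$InDegree + centrality$OutDegree
print(centrality)
```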
Via the output function, all examined attributes of the networks and their nodes/edges are exported to CSV files. The structure of these networks is also exported as Tab-separated values text files (.SIF extension). The following is example R code for the output:
data(amrn)

# generate all possible initial-states each containing 10 Boolean nodes
set1 <- generateStates(10, "all")

# generate all possible groups each containing a single node in the AMRN network
amrn <- generateGroups(amrn, "all", 1, 0)
amrn <- calSensitivity(amrn, set1, "knockout")

# search feedback/feed-forward loops
amrn <- findFBLs(amrn, maxLength = 10)
amrn <- findFFLs(amrn)

# calculate node-/edge-based centralities
amrn <- calCentrality(amrn)

# export all results to CSV files
output(amrn)
The methods of dynamics and structure analysis described in the above sections (except the findAttractors function, due to memory limitations) can also be applied to a set of networks, not only to a single network. The RMut package provides the createRBNs function to generate a set of random networks using one of four generation models (refer to the literature in the References section for more details):
Barabasi-Albert (BA) model [1]
Erdos-Renyi (ER) variant model [2]
Two shuffling models (Shuffle 1 and Shuffle 2) [3]
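As a rough illustration of what a random-network generator produces, the following base-R sketch draws a directed network with a fixed number of nodes and edges uniformly at random, without self-loops or duplicate edges, and assigns each edge a random sign. This is a generic random-graph sketch and not the ER variant, BA or shuffling model implemented in createRBNs; the function name random_directed_network and the node labels are hypothetical.

```r
# Hypothetical helper: sample n_edges distinct directed edges among n_nodes nodes.
random_directed_network <- function(n_nodes, n_edges) {
  nodes <- paste0("N", seq_len(n_nodes))

  # All ordered node pairs except self-loops.
  pairs <- expand.grid(source = nodes, target = nodes, stringsAsFactors = FALSE)
  pairs <- pairs[pairs$source != pairs$target, ]
  stopifnot(n_edges <= nrow(pairs))

  # Pick edges without replacement and give each a random sign (+1 or -1).
  picked <- pairs[sample.int(nrow(pairs), n_edges), ]
  picked$interaction <- sample(c(-1L, 1L), n_edges, replace = TRUE)
  rownames(picked) <- NULL
  picked[, c("source", "interaction", "target")]
}

set.seed(1)  # fixed seed so that the sketch is reproducible
net <- random_directed_network(10, 17)
head(net)
```

The sizes 10 and 17 are chosen only to echo the numeric arguments in Example 1 below; the meaning of those arguments in createRBNs is not asserted here.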
Here, we show two examples of generating a set of random networks and analyzing the dynamics-related sensitivity and structural characteristics of those networks:
Example 1
# Example 1: generate random networks based on BA model #
#########################################################

# generate all possible initial-states each containing 10 Boolean nodes
set1 <- generateStates(10, "all")

# generate two random networks based on BA model
ba_rbns <- createRBNs("BA_RBN_", 2, "BA", 10, 17)

# for each random network, generate all possible groups each containing a single node
ba_rbns <- generateGroups(ba_rbns, "all", 1, 0)

# for each random network, calculate the sensitivity values of all nodes against "knockout" mutation
ba_rbns <- calSensitivity(ba_rbns, set1, "knockout")

# for each random network, calculate structural measures of all nodes/edges
ba_rbns <- findFBLs(ba_rbns, maxLength = 10)
ba_rbns <- findFFLs(ba_rbns)
ba_rbns <- calCentrality(ba_rbns)

print(ba_rbns)
output(ba_rbns)
Example 2
# Example 2: generate random networks based on "Shuffle 2" model #
##################################################################

data(amrn)

# generate all possible initial-states each containing 10 Boolean nodes
set1 <- generateStates(10, "all")

# generate two random networks based on "Shuffle 2" model
amrn_rbns <- createRBNs("AMRN_RBN_", 2, "shuffle 2", referedNetwork = amrn)

# for each random network, generate all possible groups each containing a single edge
amrn_rbns <- generateGroups(amrn_rbns, "all", 0, 1)

# for each random network, calculate the sensitivity values of all edges against "edge removal" mutation
amrn_rbns <- calSensitivity(amrn_rbns, set1, "edge removal")

# for each random network, calculate structural measures of all nodes/edges
amrn_rbns <- findFBLs(amrn_rbns, maxLength = 10)
amrn_rbns <- findFFLs(amrn_rbns)
amrn_rbns <- calCentrality(amrn_rbns)

print(amrn_rbns)
output(amrn_rbns)
[1] Barabasi A-L, Albert R (1999) Emergence of Scaling in Random Networks. Science 286: 509-512. doi: 10.1126/science.286.5439.509
[2] Le D-H, Kwon Y-K (2011) NetDS: A Cytoscape plugin to analyze the robustness of dynamics and feedforward/feedback loop structures of biological networks. Bioinformatics.
[3] Trinh H-C, Le D-H, Kwon Y-K (2014) PANET: A GPU-Based Tool for Fast Parallel Analysis of Robustness Dynamics and Feed-Forward/Feedback Loop Structures in Large-Scale Biological Networks. PLoS ONE 9: e103010.
[4] Koschutzki D, Schwobbermeyer H, Schreiber F (2007) Ranking of network elements based on functional substructures. Journal of Theoretical Biology 248: 471-479.