mat2adj
is a high-level function providing different network inference methods. The function takes as input an N by P data matrix, with N samples on the rows and P variables on the columns. The P by P adjacency matrix is computed with the specified method, using the N samples to infer the interactions between the variables.
x 
a matrix or data.frame of numerical values of N rows and P columns 
method 
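a character string indicating which method will be used for inferring a relationship between two variables. This must be (an abbreviation of) one of the methods described below: cor (default), ARACNE, CLR, WGCNA, bicor, TOM, MINE, MINEFDR, bicorFDR, WGCNAFDR, DTWMIC.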
P 
6 (default), an integer used as the soft-thresholding power for network construction, used by the WGCNA and TOM methods.
FDR 
1e-3 (default), a number which determines how many values are generated to estimate the null hypothesis (1/FDR permutations). To be used for methods MINEFDR, bicorFDR and WGCNAFDR.
measure 

alpha 
0.6 (default), the exponent of the search-grid size B(n)=n^{α}.
C 
15 (default), an integer value to be passed to the mine function
main. Only for methods 
DP 
1 (default), only for method 
... 
Additional arguments to be passed to the downstream functions. Normally the arguments passed through ... are processed by the functions which compute the inference. Not all parameters are used by all functions.
mat2adj is a high-level function which includes different methods for network inference. In particular, the function infers the relation for every possible pairwise comparison between the columns of the dataset. If the input is a data.frame, its columns are first converted into a numerical matrix. Given an N by P numerical matrix, the relation between each of the P x P pairs of variables is inferred with the selected method.
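The pairwise scheme described above can be sketched in base R. This is only an illustration under stated assumptions: the toy data and the names df, x, n_var and rel are made up for the example, and plain Pearson correlation stands in for the selected method. It does not reproduce the package's internal code.

```r
# Toy data.frame: N = 6 samples (rows), 3 variables (columns).
df <- data.frame(v1 = c(1, 2, 3, 4, 5, 6),
                 v2 = c(2, 4, 6, 8, 10, 12),
                 v3 = c(3, 1, 4, 1, 5, 9))

x <- as.matrix(df)      # data.frame -> numerical matrix
n_var <- ncol(x)        # number of variables (P in the text)

# One relation value per pair of columns, stored in a P by P matrix.
rel <- matrix(0, n_var, n_var,
              dimnames = list(colnames(x), colnames(x)))
for (i in seq_len(n_var)) {
  for (j in seq_len(n_var)) {
    # Stand-in for "the selected method": Pearson correlation.
    rel[i, j] <- cor(x[, i], x[, j])
  }
}
print(dim(rel))
```

The explicit double loop is used only to make the "all pairs of columns" idea visible; calling cor(x) on the whole matrix gives the same values in one step.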
The FDR-corrected methods are based on a permutation estimate of the null hypothesis. A total of 1/FDR permutations are performed to assess the reliability of each inferred link; a link is set only if, in all the permutations, the weight obtained on permuted data is lower than the value obtained on the non-permuted data. The default value for FDR is 1e-3, which corresponds to 1000 permutations.
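A minimal sketch of this permutation idea for a single pair of variables is given below. It rests on assumptions: the helper perm_keep_link is hypothetical and not part of the package, the absolute Pearson correlation is used as the link weight, and the null is simulated by shuffling one of the two variables. The package's own implementation may differ.

```r
# Hypothetical helper (not part of the package): permutation check for one
# pair of variables, using |Pearson correlation| as the link weight.
perm_keep_link <- function(x, y, FDR = 1e-3) {
  n_perm <- round(1 / FDR)             # number of permutations
  observed <- abs(cor(x, y))           # weight on the non-permuted data
  # Weight on permuted data: shuffling y breaks any real association.
  permuted <- replicate(n_perm, abs(cor(x, sample(y))))
  all(permuted < observed)             # keep only if every permutation is lower
}

set.seed(1)
x <- 1:50
y <- 2 * x + 1                         # perfectly linear relation
keep <- perm_keep_link(x, y, FDR = 1e-2)   # 100 permutations
print(keep)
```

A smaller FDR means more permutations and therefore a stricter (and slower) check, since a single permutation reaching the observed weight is enough to drop the link.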
All the available methods are the following:
cor
(default) computes the interaction using the Pearson correlation coefficient. Other correlation methods, such as Spearman, can be passed to the function through the ... argument.
ARACNE
Algorithm for the Reconstruction of Gene Regulatory Networks; see also the minet package.
CLR
Context Likelihood of Relatedness; see also the minet package.
WGCNA
WeiGhted Correlation Network Analysis. It is based on a correlation measure. For further details see the documentation of the WGCNA package. The method accepts the parameter P, which is set to 6 by default.
bicor
Biweight midcorrelation method. It uses the biweight correlation as implemented by the bicor function of the WGCNA package.
TOM
Topological Overlap Measure inference method. For further details see the documentation of the WGCNA package. As for WGCNA, the parameter P can be set (6 by default).
MINE
Maximal Information-based Nonparametric Exploration. This method uses the minerva implementation of the original measure. For this method, different measures are available. See minerva for further information. To clarify the main MINE family statistics, let D={(x,y)} be the set of n ordered pairs of elements of x and y. The data space is partitioned in an X-by-Y grid, grouping the x and y values in X and Y bins respectively.
The value of alpha (default 0.6) has been empirically chosen by the authors of the original paper. alpha is the exponent of the search-grid size B(n)=n^{α}. It is worth noting that alpha and C are defined to obtain a heuristic approximation in a reasonable amount of time. In case of small sample size (n) it is preferable to increase alpha to 1 to obtain a solution closer to the theoretical one.
C determines the number of starting points of the X-by-Y search-grid. When trying to partition the x-axis into X columns, the algorithm will start with at most C x X clumps. Default value is 15.
The Maximal Information Coefficient (MIC) is defined as
MIC(D) = max_{XY<B(n)} M(D)_{X,Y} = max_{XY<B(n)} I*(D,X,Y) / log(min(X,Y)),
where B(n)=n^{α} is the search-grid size and I*(D,X,Y) is the maximum mutual information, over all X-by-Y grids, of the distribution induced by D on a grid having X and Y bins (where the probability mass on a cell of the grid is the fraction of points of D falling in that cell). The other statistics of the MINE family are derived from the mutual information matrix achieved by an X-by-Y grid on D. The Maximum Asymmetry Score (MAS) is defined as
MAS(D) = max_{XY<B(n)} |M(D)_{X,Y} - M(D)_{Y,X}|.
The Maximum Edge Value (MEV) is defined as
MEV(D) = max_{XY<B(n)} {M(D)_{X,Y}: X=2 or Y=2}.
The Minimum Cell Number (MCN) is defined as
MCN(D,ε) = min_{XY<B(n)} {log(XY): M(D)_{X,Y} >= (1-ε)MIC(D)}.
More details are provided in the supplementary material (SOM) of the original paper.
MINEFDR
This calls an FDR-corrected version of the standard MINE method. See the description of the MINE method. Parameter FDR=1e-3 (default) can be set.
bicorFDR
This calls an FDR-corrected version of the bicor method. See the description of the bicor method. Parameter FDR=1e-3 (default) can be set.
WGCNAFDR
This calls an FDR-corrected version of the WGCNA method. Parameter P cannot be set for this method. Parameter FDR=1e-3 (default) can be set.
DTWMIC
This method uses the Dynamic Time Warping transformation coupled with the MIC statistic from the MINE family. See Details for further information. Additional parameters can be set with this method:
tol
1e-5 (default), a numeric value which controls the tolerance on the variable variance. In particular, this parameter is passed to a function which checks the variance of each feature. The function returns the indexes of the features with variance < tol. Indexes refer to 1-based column numbers of the original dataset.
var.thr
1e-5 (default), a numeric value which controls the tolerance parameter on the column variance for the methods MINE, MINEFDR and DTWMIC.
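The variance check described for tol can be illustrated with a short base R sketch. The function name low_variance_cols and the toy matrix are hypothetical, introduced only for this example. The package's own checking function is not shown in this page and may behave differently in details.

```r
# Hypothetical helper (not the package's function): return the 1-based
# indexes of the columns whose variance is below the tolerance.
low_variance_cols <- function(x, tol = 1e-5) {
  v <- apply(as.matrix(x), 2, var)   # variance of each column (feature)
  unname(which(v < tol))             # 1-based column numbers
}

x <- cbind(a = c(1, 2, 3, 4),
           b = c(5, 5, 5, 5),        # constant column: variance 0
           c = c(0.1, 0.9, 0.4, 0.7))
idx <- low_variance_cols(x)
print(idx)                           # 2
```

Nearly constant columns are worth flagging because correlation-like and grid-based measures are undefined or unstable when a variable has almost no spread.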
A P by P symmetric adjacency matrix with the diagonal set to 0. Self-loops and the direction of the edges are not taken into account. The values range in [0, 1].
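The properties of the returned matrix can be checked on a small example. The sketch below builds a stand-in adjacency by soft thresholding, raising the absolute Pearson correlation to a power as in the unsigned WGCNA adjacency. Using that variant here is an assumption, and the names x, power and adj are illustrative only. It is not output of mat2adj.

```r
set.seed(42)
# Toy data: 20 samples (rows), 4 variables (columns).
x <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4)

power <- 6                   # soft-thresholding power (cf. the P argument)
adj <- abs(cor(x))^power     # unsigned soft-thresholded correlation
diag(adj) <- 0               # no self-loops

# Checks matching the description of the returned value.
print(isSymmetric(adj))                  # undirected: symmetric matrix
print(all(diag(adj) == 0))               # zero diagonal
print(all(adj >= 0 & adj <= 1))          # values in [0, 1]
```

Raising a value in [0, 1] to a power greater than 1 keeps it in [0, 1] while shrinking weak correlations much more than strong ones, which is the purpose of the soft threshold.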
Michele Filosi
Special thanks to:
Samantha Riccadonna, Giuseppe Jurman, Davide Albanese and Cesare
Furlanello
P. Langfelder, S. Horvath (2008). WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 9:559
P. E. Meyer, F. Lafitte, G. Bontempi (2008). MINET: An open source R/Bioconductor Package for Mutual Information based Network Inference. BMC Bioinformatics. http://www.biomedcentral.com/1471-2105/9/461
Jeremiah J Faith, Boris Hayete, Joshua T Thaden, Ilaria Mogno, Jamey Wierzbowski, Guillaume Cottarel, Simon Kasif, James J Collins, Timothy S Gardner. Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles
D. Albanese, M. Filosi, R. Visintainer, S. Riccadonna, G. Jurman, C. Furlanello (2013). minerva and minepy: a C engine for the MINE suite and its R, Python and MATLAB wrappers. Bioinformatics
M. Filosi, R. Visintainer, S. Riccadonna, G. Jurman, C. Furlanello (2014). Stability Indicators in Network Reconstruction. PLOS ONE
D. Reshef, Y. Reshef, H. Finucane, S. Grossman, G. McVean, P. Turnbaugh, E. Lander, M. Mitzenmacher, P. Sabeti (2011). Detecting novel associations in large datasets. Science (SOM: Supplementary Online Material at http://www.sciencemag.org/content/suppl/2011/12/14/334.6062.1518.DC1)
WGCNA, minerva, minet, cor