README.md
In sahatava/MixMPLN:

# MixMPLN "MixMPLN" is a package written in R, which has two features. First feature is generating a proper synthetic sample-taxa count matrix. Second feature is receiving a sample-taxa count matrix and extracting k(number of components) different interaction networks between taxa.

Following packages must be installed and loaded in R environment before using "MixMPLN":

library(mvtnorm)
library(Matrix)
library(matrixcalc)
library(corpcor)
library(cvTools)
library(huge)
library(edgeR)
library(factoextra)
library(igraph)
library(ROCR)

After installing the previous prerequisities, please run the following lines to install the "MixMPLN"

library("devtools")
install_github("sahatava/MixMPLN")
library("MixMPLN")

"MixMPLN" is able to generate the synthetic matrix which follows the real data features. If you are interested to apply the "MixMPLN" on your own data you could skip this part and jump to the next part (4-Extracting K interaction networks). Our generated synthetic data consists of K componenets. For each componenet, a random positive definite matrix with specific level of sparsity as a real precision matrix is generated. N generated samples for each componenet follows the MPLN distribution based on the mentioned precision matrix and a random mean vector. Finally, samples from different componets are combined together to generate multi component original count data. The generated original count data follows by multinomial distribution to mimic the sampling process.

Usage

out_generate = generate(K , N , d , sp, type)

Arguments

K-------> number of components
N-------> number of samples in each component
d-------> number of taxa for the synthetic sample-taxa count matrix
sp------> a value between 0 and 1 (sparsity level)
type----> data type: "orig" (no sampling applied) , "samp" (after sampling) , "TMM" (after normalization)

Values

out_generate$M -----------------> synthetic sample-taxa count matrix(rows represent samples and columns represent taxa)
out_generate$real_precision[[i]]-----> i_th real precision matrix

MixMPLN is a fuction which accept a sample-taxa matrix as an input to extract K different underlied interarction networks. Please save your data as matrix which its rows represents samples and its columns represents the taxa names. If you are interested in visualising the final graphs, you need to save the taxa names as colnames of this input matrix. K( number of components ) is another input for MixMPLN which should be specified by the user. For initialize the optimization using the Kmeans clustering please specified init="Kmeans" and rep=1. Otherwise, init="Random" and rep can be any integer number. As higher value for rep results in higher running time, we suggest to set it as 3 to trade off between the time and accuracy.

Usage

out_MixMPLN = MixMPLN(M , K , penalty , init , rep )

Arguments

M--------> name of the csv file which includes the sample-taxa matrix
K-----------> number of components 
penalty------> method to select tuning parameter: "no_sparsity" , "CV" (cross validation) , "StARS" , "fixed" , "iterative" 
init---------> initialization can be "Kmeans" or "Random"
rep----------> number of repeats when initialization is "Random"
out---------> the type of output for each componenet: precision matrix, partial correlation, lambda, mixing coefficiant, clustering membership 
threshold---> a value between 0 and 1. This threshold is used to generate adjacency matrix from partial correlation

Values

out_MixMPLN$precision[[i]]-----> i_th precision matrix(dXd)
out_MixMPLN$partial[[i]]-------> i_th partial correlation matrix(dXd)
out_MixMPLN$lambda[[i]]--------> i_th lambda matrix(NXd)
out_MixMPLN$pi-----------------> mixing coefficiant
out_MixMPLN$cluster------------> clustering membership

fviz_nbclust(M , MixMPLN ,penalty ,k.max, method = c("silhouette"))

Arguments

M-----------> sample-taxa count matrix
K.max-------> k=1 to k=k.max will be checked
penalty------> method to select tuning parameter: "no_sparsity" , "CV" (cross validation) , "StARS" , "fixed" , "iterative"

Values

Running the mentioned command will plot a diagram which shows optimum K

visualize(partial , threshold )

Arguments

partial--------> partial correlation matrix corresponding to one component
threshold------> Threshold to get the adjacency matrix of the graph

Values

The graphical plot for the taxa interaction network.

out_adj = partial2adj(partial, threshold)

Arguments

partial-------------> partial correlation matrix as a result of running the MixMPLN
threshold-----------> a value between 0 and 1

Values

out_adj-------------> 1 and -1 indicate edge existance, 0 indicates no edge

out = compare(out_generate$real_precision , out_MixMPLN$precision)

Arguments

ut_generate$real_precision -------> real partial correlation
out_MixMPLN$precision -------> estimated partial correlation

Values

out$frob------> forbenious norm of the difference
out$fr--------> relative difference
out$rms-------> rms difference
out$sn--------> sensitivity
out$sp--------> specificity
out$ROC-------> area under the ROC

out_generate = generate(K=2 , N=30 , d=30 , sp=0.8, type="orig")
out_MixMPLN = MixMPLN( out_generate$M , K=2 , penalty="CV" , init = "Random" , rep = 2)
out = compare(out_generate$real_precision  , out_MixMPLN$precision)

M=as.matrix(read.csv("real_data.csv",sep=",",header=TRUE,row.names=1,check.names=FALSE))
fviz_nbclust(M , MixMPLN ,"CV",k.max, method = c("silhouette"))
out_MixMPLN = MixMPLN( M , K=2 , penalty="CV", init = "Random" , rep = 2)
visualize(out_MixMPLN$partial[[1]] , threshold=0.3 )

The "MixMPLN" method is explained in following paper: Sahar Tavakoli and Shibu Yooseph. Learning a mixture of microbial networks using minorization-maximization. To appear in Intelligent Systems for Molecular Biology, 2019 Please note that in our implementation we have used the "huge" package in all the variations.

sahatava/MixMPLN documentation built on July 29, 2019, 8:55 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

sahatava/MixMPLN

README.md
In sahatava/MixMPLN:

1-Prerequisites

2-Installation

3-Generating Synthetic data

Usage

Arguments

Values

4-Extracting K interaction networks

Usage

Arguments

Values

5-Auxiliary functions

silhouette method to find the optimal K

Arguments

Values

Graphical display

Arguments

Values

Converting the partial corelation to the adjacency matrix

Arguments

Values

Comparing the estimated precision matrix and the real one for synthetic data

Arguments

Values

6-Example

Example on synthetic data

Example on real data

Publication

R Package Documentation

Browse R Packages

We want your feedback!

sahatava/MixMPLN

README.md In sahatava/MixMPLN:

1-Prerequisites

2-Installation

3-Generating Synthetic data

Usage

Arguments

Values

4-Extracting K interaction networks

Usage

Arguments

Values

5-Auxiliary functions

silhouette method to find the optimal K

Arguments

Values

Graphical display

Arguments

Values

Converting the partial corelation to the adjacency matrix

Arguments

Values

Comparing the estimated precision matrix and the real one for synthetic data

Arguments

Values

6-Example

Example on synthetic data

Example on real data

Publication

R Package Documentation

Browse R Packages

We want your feedback!

README.md
In sahatava/MixMPLN: