knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
R-package implementation for the method presented in the paper "Information Enhanced Model Selection for Gaussian Graphical Model with Application to Metabolomic Data" by Zhou et al. 2020.
Abstract
In light of the low signal-to-noise nature of many large biological data sets, we propose a novel method to learn the structure of association networks using a Gaussian graphical model combined with prior knowledge. Our strategy includes two parts. In the rst part, we propose a model selection criterion called structural Bayesian information criterion (SBIC), in which the prior structure is modeled and incorporated into the Bayesian information criterion (BIC). It is shown that the popular extended BIC (EBIC) is a special case of SBIC. In second part, we propose a two-step algorithm to construct the candidate model pool. The algorithm is data-driven and the prior structure is embedded into the candidate model automatically. Theoretical investigation shows that under some mild conditions SBIC is a consistent model selection criterion
for the high-dimensional Gaussian graphical model. Simulation studies validate the superiority of
the proposed algorithm over the existing ones and show the robustness to the model misspecification.
Application to relative concentration data from infant feces collected from subjects enrolled
in a large molecular epidemiological cohort study validates that prior knowledge on metabolic
pathway involvement is a statistically signicant factor for the conditional dependence among
metabolites. Furthermore, new relationships among metabolites are identied through the proposed
algorithm which have not been discovered by conventional pathway analysis. Some of them
have been widely recognized in the literature.
The development version from GitHub with:
# install.packages("remotes") remotes::install_github("hoenlab/SBIC")
This is a basic example which shows you how to solve a common problem:
library(SBICgraph) library(network) # for visualization # simulate data using the incorporated simulate function p <- 200 m1 <- 100 m2 <- 30 d <- simulate(n=100, p=p, m1=m1, m2=m2) data<- d$data real<- d$realnetwork priori<- d$priornetwork
Compare between We can visualize the networks
prior_net <- network(priori) real_net <- network(real) par(mfrow = c(1,2)) plot(prior_net, main = "Prior network") plot(real_net, main = "Real network")
Then we can fit SBIC using one function
lambda<- exp(seq(-10,10, length=30)) # calculating the error rate from the number of edges in the true graph and the number of discordant pairs r1 <- m2/m1 r2 <-m2/(p*(p-1)/2-m1) r <- (r1+r2)/2 model<- sggm(data = data, lambda = lambda, M=priori, prob = r)
Comparing the estimated network to the true and prior network. Our comparison function above calcualtes the Positive selection rate (PSR) and the False positive rate (FDR)
print("Comparing estimated model with the real network") comparison(real = real, estimate = model$networkhat) print("Comparing the prior network with the real network") comparison(real = real, estimate = priori)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.