Towards Confidence Estimates in Cascade Networks using the SelectBoost Package

#file.edit(normalizePath("~/.Renviron"))
LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE")
#LOCAL=TRUE
knitr::opts_chunk$set(purl = LOCAL)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

Aims of the vignette

Extending results from the Cascade package: reverse engineering with selectboost to compute confidence indices for a fitted model. We first fit a model to a cascade network using the Cascade package inference function then we compute confidence indices for the inferred links using the Selecboost algorithm.

If you are a Linux/Unix or a Macos user, you can install a version of SelectBoost with support for doMC from github with:

devtools::install_github("fbertran/SelectBoost", ref = "doMC")

References

Code

Code to reproduce the datasets saved with the package and some the figures of the article Bertrand et al. (2020), Bioinformatics. https://doi.org/10.1093/bioinformatics/btaa855

Data simulation

library(Cascade)

We define the F array for the simulations.

T<-4
F<-array(0,c(T-1,T-1,T*(T-1)/2))

for(i in 1:(T*(T-1)/2)){diag(F[,,i])<-1}
F[,,2]<-F[,,2]*0.2
F[2,1,2]<-1
F[3,2,2]<-1
F[,,4]<-F[,,2]*0.3
F[3,1,4]<-1
F[,,5]<-F[,,2]

We set the seed to make the results reproducible

set.seed(1)
Net<-Cascade::network_random(
  nb=100,
  time_label=rep(1:4,each=25),
  exp=1,
  init=1,
  regul=round(rexp(100,1))+1,
  min_expr=0.1,
  max_expr=2,
  casc.level=0.4
)
Net@F<-F

We simulate gene expression according to the network Net

M <- Cascade::gene_expr_simulation(
  network=Net,
  time_label=rep(1:4,each=25),
  subject=5,
  level_peak=200)

Network inference

We infer the new network.

Net_inf_C <- Cascade::inference(M,cv.subjects=TRUE)

Heatmap of the coefficients of the Omega matrix of the network. Run the code to get the graph.

stats::heatmap(Net_inf_C@network, Rowv = NA, Colv = NA, scale="none", revC=TRUE)
Fab_inf_C <- Net_inf_C@F

Compute the confidence indices for the inference

library(SelectBoost)
set.seed(1)

By default the crossvalidation is made subjectwise according to a leave one out scheme and the resampling analysis is made at the .95 c0 level. To pass CRAN tests, use.parallel = FALSE is required. Set use.parallel = TRUE and select the number of cores using ncores = 4.

net_pct_selected <- selectboost(M, Fab_inf_C, use.parallel = FALSE)
net_pct_selected_.5 <- selectboost(M, Fab_inf_C, c0value = .5, use.parallel = FALSE)
net_pct_selected_thr <- selectboost(M, Fab_inf_C, group = group_func_1, use.parallel = FALSE)

Use cv.subject=FALSE to use default crossvalidation

net_pct_selected_cv <- selectboost(M, Fab_inf_C, cv.subject=FALSE, use.parallel = FALSE)

Analysis of the confidence indices

Use plot to display the result of the confidence analysis.

plot(net_pct_selected)

Run the code to plot the other results.

plot(net_pct_selected_.5)
plot(net_pct_selected_thr)
plot(net_pct_selected_cv)

Run the code to plot the remaning graphs.

Distribution of non-zero (absolute value > 1e-5) coefficients

hist(Net_inf_C@network[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .95 resampling level versus coefficient value for non-zero (absolute value > 1e-5) coefficients

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected@network.confidence[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .5 resampling level versus coefficient value for non-zero (absolute value > 1e-5) coefficients

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected_.5@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected_.5@network.confidence[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .95 resamling level with groups created by thresholding the correlation matrix versus coefficient value for non-zero (absolute value > 1e-5) coefficients.

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected_thr@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected_thr@network.confidence[abs(Net_inf_C@network)>1e-5])

Plot of confidence at .95 resampling level versus coefficient value for non-zero (absolute value > 1e-5) coefficients using standard cross-validation.

plot(Net_inf_C@network[abs(Net_inf_C@network)>1e-5],net_pct_selected_cv@network.confidence[abs(Net_inf_C@network)>1e-5])
hist(net_pct_selected_cv@network.confidence[abs(Net_inf_C@network)>1e-5])

Further improvements

For further recommandations on the choice of the c0 parameter, for instance as a quantile, please consult the SI of Bertrand et al. 2020.



Try the SelectBoost package in your browser

Any scripts or data that you put into this service are public.

SelectBoost documentation built on Dec. 1, 2022, 1:27 a.m.