In CityUHK-CompBio/nemTar: A method for singlaing network inference from (epi)genetic aberration profiles

1 Introduction

Cancers are not single disease entities, but comprising multiple molecularly distinct subtypes, and the heterogeneity prevents precise selection of patients for optimized therapy. Dissecting cancer subtype-specific signaling pathways is crucial to pinpointing dysregulated genes for the prioritization of novel therapeutic targets. Nested effects models (NEMs) are a group of graphical models that encode subset relations between observed downstream effects under perturbations to upstream signaling genes, providing a prototype for mapping the inner workings of the cell. In this study, we developed NEM-Tar, which extends the original NEMs to predict drug targets by incorporating causal information of (epi)genetic aberrations for signaling pathway inference. An information theory-based score, weighted information gain (WIG), was proposed to assess the impact of signaling genes on a specific downstream biological process. We will show in detail how to implement a toy example for singaling network inference as well as a real case study for prioritizing the potential therapeutic targets.

2 A brief overview of NEMs and NEM-Tar

knitr::include_graphics("E:/Thesis/NEM_tar_Fig/Figure 2_0915.png")

Figure 1. Comparison between the structures of (A) classic NEM and (B) our proposed NEM-Tar

As shown in Figure 1A, in classic nested effects models, the S-genes are modeled as hidden variables, and their signaling interaction graph G (solid arrows) is the target to infer. In experiments with perturbations to individual S-genes, differential expression of downstream genes could be observed and considered as effect reporter genes (E-genes). Assuming that each E-gene is directly regulated by at most one S-gene in G, the maximum a posteriori attachment Θ (dashed arrows) of effect genes to S-genes could be computed. The goal is to search for the signaling graph G, which yields the most likely probabilistic nested effects. Illustrated in Figure 1B, an extra observational dimension (the real patients) is the factor that NEM-Tar should deal with.The necessary adjustment should be conducted on the design and inference strategies of classic NEM. However, the information needs to infer is also the hidden interaction between S-genes and the attachment relationship of E-genes to S-genes.And the likelihood function of NEM-Tar is similar to that of NEMs, except the state matrix of regulators (S-genes) S* in our model.

3 Network inference on a toy example

3.1 Introduction of the in-silico data

The applicaitons on real case studies of NEM-Tar require a lot of data preprocssing and integrative selection of singaling genes(S-genes) and effect reporter genes(E-genes).For the purpose of interpreting the main work flow and contribution of NEM-Tar.At first, we will introduce the employed in-silico data.

library(nemTar)
library(dplyr)
data("example")

The toy example contains four different elements.The S-gene number was applied in medium size(12), thus the dimension of Sgene_hidden, storing the unobserved S-gene states was 12\times12; Edata is the E-gene profiles with the dimension 804\times100, S_obs is the observed profile of S-gene states 'after' perturbations,with the dimension of 100\times12.

dim(Edata)
dim(S_obs)
dim(Sgene_hidden)
para

3.2 Network Inference Using Greedy hill-climbing

control<-nem::set.default.parameters(Sgenes=rownames(Sgene_hidden),type="mLL",para=para)
nemTar_rslt<-nem_Tar_greedy(Edata,Sgenes=control$Sgenes,S_obs,control=control)

This process takes a few seconds.

3.3 Visulization of the inferred S-gene network

To visualize the inference results, an R package RedeR is suggested.However, the function plot.nem could give a more quickly visulization.The error matrix is the difference between the inferred S-gene matrix and the generated S-gene matrix, the result that all the entries is 0 indicates that the inference is perfect.

plot.nem(nemTar_rslt,what="graph")
plot.nem(nemTar_rslt,what="graph",transitiveReduction=T)
adj<-as(nemTar_rslt$graph, "matrix")
error<-Sgene_hidden-adj
print(error)

4 Case study I-Inferring the Signaling Network Driving the EMT Subtype of Gastric Cancer and Prioritization of Potential Drug Targets

4.1 Introduction of the input profiles of GC

The profiles of GC contains three different elements.The S-gene number was determined as 14, thus the dimension of adjacency matrix of S-gene interacction network was 15\times15; Edata_GC_ori and D are the E-gene profiles before and after the transformation to binary variable,with the dimension of 1194\times38 and 824\times28, respectively; Sdata_GC is the observed profile of 'natural' perturbation states of Sgenes with the dimension of 28\times15.

# load the input profiles of GC
library(nemTar)
data("case_GC")
data("EMT_list")

4.2 Transforming the E-gene profiles into binary variable and network inference

res.disc <- nem.discretize(Edata_GC_ori,neg.control=1:8,pos.control=9:10,nfold=2,cutoff= 0.5)
D<-res.disc$dat
para<-res.disc$para
control<-set.default.parameters(Sgenes=Sgenes_GC,type="mLL",para=para)
nemTar_GC<-nem_Tar_greedy(D=D,Sgenes=Sgenes_GC,S_pattern=Sdata_GC,control=control)

4.3 Visulization of the inferred S-gene network

4.3.1 Quick visulization

plot.nem(nemTar_GC,what="graph")
plot.nem(nemTar_GC,what="graph",transitiveReduction=T)

4.3.2 Employing the 'RedeR' package to visualize the network(after adjustment of the layout,color etc.)

knitr::include_graphics("E:/NEM_paper2/GC_EMT_net.png")

4.4 Assessment of the influence of S-gene perturbations on EMT process

The statistical significance of the causal effect that the S-genes exert on downstream pathways(EMT pathway here) is quantified by permutation tests, i.e.,random sampling of E-genes with the same number of EMT signature genes in the regulon of a S-gene, and calculating the frequency of observing a same or higher WIG from the sampled E-gene sequences.(The value 0 of the adjusted P stands for the corresponding P value is less than the resolution of current sampling times,in the following case P<1e-05.)

EMT_post<-path_post(nemTar_GC,EMT_list,Sgenes_GC)
WIG<-compute_WIG(EMT_post$post_affected,EMT_post$path_affected,15)
sample_WIG0<-WIG_sample(EMT_post,nemTar_GC,EMT_list,15,1e05)
sig_test<-WIGsig_test(sample_WIG0,WIG,1e05)
### results summary
sig_rslt<-data.frame(S_genes=names(EMT_post$Sig_affected)[which(lengths(EMT_post$path_affected)!=0)],
                     WIG=round(WIG$WIG,2),adjusted_P=format(round(sig_test,6),
                                                            scientific = TRUE,digits = 3))
colnames(sig_rslt)<-c("S genes","WIG","Adjusted P")
sig_rslt<-sig_rslt[order(sig_rslt$WIG,decreasing=TRUE),]
knitr::kable(sig_rslt,align="l",row.names=F,caption="Assessment of single S-gene perturbation  
             on EMT in GC")

Also, the weigheted information gain(WIG) of combinational perturbation of 2 S-genes could be calcualted.The kinases of double perturbation with higher WIGs could be the candidate combinational therapeutic targets.

Sgene_double_WIG<-WIG_double(EMT_post,15)
### results summary
sig_rslt<-data.frame(S_genes=names(Sgene_double_WIG$WIG),
                     WIG=round(Sgene_double_WIG$WIG,2))
colnames(sig_rslt)<-c("S genes","WIG")
sig_rslt<-sig_rslt[order(sig_rslt$WIG,decreasing=TRUE),]
knitr::kable(sig_rslt[match(c("CDH1/ERBB2","KRAS/CDH1","BRAF/CDH1","KRAS/ERBB2",
                              "BRAF/ERBB2","KRAS/BRAF"),rownames(sig_rslt)),],
             align="l",row.names=F,caption="Assessment of double perturbations 
             (kinase only) on EMT in GC")

5 Case study II-Inferring the Signaling Network Driving the CMS4-mesenchymal Subtype of Colorectal Cancer(CRC) and Prioritization of Potential Drug Targets

5.1 Introduction of the input profiles of CRC

The profiles of GC contains three different elements.The S-gene number was prioritized as 15, thus the dimension of adjacency matrix of S-gene interacction network was 13\times13; Edata_GC_ori and D are the E-gene profiles before and after the transformation to binary variable,with the dimension of 1337\times96 and 1323\times50,respectively; Sdata_CRC is the observed profile of 'natural' perturbation states of S-genes with the dimension 50\times13.

library(nemTar)
data("case_CRC")
data("EMT_list")

5.2 Transforming the E-gene profiles into binary variable and network inference

res.disc <- nem.discretize(Edata_CRC_ori,neg.control=1:30,pos.control=31:46,nfold=2,cutoff=0.6)
D<-res.disc$dat
para<-res.disc$para
control<-set.default.parameters(Sgenes=Sgenes_CRC,type="mLL",para=para)
nemTar_CRC<-nem_Tar_greedy(D,Sgenes=control$Sgenes,Sdata_CRC,control=control)

5.3 Visulization of the inferred S-gene network

5.3.1 Quick visulization

plot.nem(nemTar_CRC,what="graph")
plot.nem(nemTar_CRC,what="graph",transitiveReduction=T)

5.3.2 Employing the 'RedeR' package to visualize the network(after adjustment of the layout,color etc.)

knitr::include_graphics("E:/NEM_paper2/CRC_CMS4_net.png")

5.4 Assessment of the influence of S-gene perturbations on EMT process

EMT_post<-path_post(nemTar_CRC,EMT_list,Sgenes_CRC)
WIG<-compute_WIG(EMT_post$post_affected,EMT_post$path_affected,13)
sample_WIG0<-WIG_sample(EMT_post,nemTar_CRC,EMT_list,13,1e05)
sig_test<-WIGsig_test(sample_WIG0,WIG,1e05)
### results summary
sig_rslt<-data.frame(S_genes=names(EMT_post$Sig_affected)[which(lengths(EMT_post$path_affected)!=0)],
                     WIG=round(WIG$WIG,2),adjusted_P=format(round(sig_test,6),
                                                            scientific = TRUE,digits = 3))
colnames(sig_rslt)<-c("Sgenes","WIG","Adjusted P")
# sig_rslt[which(sig_rslt[3]==0),3]<-"<1e-05"
sig_rslt<-sig_rslt[order(sig_rslt$WIG,decreasing=TRUE),]
knitr::kable(sig_rslt,align="l",row.names=F,caption="Assessment of the impact of single S-gene 
             perturbation on EMT in CRC")

Similar to study in GC, the weigheted information gain(WIG) of combinational perturbation of 2 Sgenes could be calcualted.The kinases of double perturbation with higher WIGs could be the candidate combinational therapeutic targets.

Sgene_double_WIG<-WIG_double(EMT_post,13)
### results summary
sig_rslt<-data.frame(S_genes=names(Sgene_double_WIG$WIG),
                     WIG=round(Sgene_double_WIG$WIG,2))
colnames(sig_rslt)<-c("S genes","WIG")
sig_rslt<-sig_rslt[order(sig_rslt$WIG,decreasing=TRUE),]
knitr::kable(sig_rslt[match(c("KRAS/CTNNB1","KRAS/TGFBR2","KRAS/ERBB4","KRAS/BRAF",
                              "KRAS/PIK3CA","BRAF/CTNNB1","PIK3CA/CTNNB1","TGFBR2/CTNNB1",
                              "ERBB4/CTNNB1","TGFBR2/ERBB4"),rownames(sig_rslt)),],
             align="l",row.names=F,caption="Assessment of double perturbations 
             (kinase only) on EMT in CRC")

CityUHK-CompBio/nemTar documentation built on March 11, 2021, 6:03 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

CityUHK-CompBio/nemTar
A method for singlaing network inference from (epi)genetic aberration profiles

In CityUHK-CompBio/nemTar: A method for singlaing network inference from (epi)genetic aberration profiles

1 Introduction

2 A brief overview of NEMs and NEM-Tar

3 Network inference on a toy example

3.1 Introduction of the in-silico data

3.2 Network Inference Using Greedy hill-climbing

3.3 Visulization of the inferred S-gene network

4 Case study I-Inferring the Signaling Network Driving the EMT Subtype of Gastric Cancer and Prioritization of Potential Drug Targets

4.1 Introduction of the input profiles of GC

4.2 Transforming the E-gene profiles into binary variable and network inference

4.3 Visulization of the inferred S-gene network

4.3.1 Quick visulization

4.3.2 Employing the 'RedeR' package to visualize the network(after adjustment of the layout,color etc.)

4.4 Assessment of the influence of S-gene perturbations on EMT process

5 Case study II-Inferring the Signaling Network Driving the CMS4-mesenchymal Subtype of Colorectal Cancer(CRC) and Prioritization of Potential Drug Targets

5.1 Introduction of the input profiles of CRC

5.2 Transforming the E-gene profiles into binary variable and network inference

5.3 Visulization of the inferred S-gene network

5.3.1 Quick visulization

5.3.2 Employing the 'RedeR' package to visualize the network(after adjustment of the layout,color etc.)

5.4 Assessment of the influence of S-gene perturbations on EMT process

R Package Documentation

Browse R Packages

We want your feedback!

CityUHK-CompBio/nemTar A method for singlaing network inference from (epi)genetic aberration profiles

In CityUHK-CompBio/nemTar: A method for singlaing network inference from (epi)genetic aberration profiles

1 Introduction

2 A brief overview of NEMs and NEM-Tar

3 Network inference on a toy example

3.1 Introduction of the in-silico data

3.2 Network Inference Using Greedy hill-climbing

3.3 Visulization of the inferred S-gene network

4 Case study I-Inferring the Signaling Network Driving the EMT Subtype of Gastric Cancer and Prioritization of Potential Drug Targets

4.1 Introduction of the input profiles of GC

4.2 Transforming the E-gene profiles into binary variable and network inference

4.3 Visulization of the inferred S-gene network

4.3.1 Quick visulization

4.3.2 Employing the 'RedeR' package to visualize the network(after adjustment of the layout,color etc.)

4.4 Assessment of the influence of S-gene perturbations on EMT process

5 Case study II-Inferring the Signaling Network Driving the CMS4-mesenchymal Subtype of Colorectal Cancer(CRC) and Prioritization of Potential Drug Targets

5.1 Introduction of the input profiles of CRC

5.2 Transforming the E-gene profiles into binary variable and network inference

5.3 Visulization of the inferred S-gene network

5.3.1 Quick visulization

5.3.2 Employing the 'RedeR' package to visualize the network(after adjustment of the layout,color etc.)

5.4 Assessment of the influence of S-gene perturbations on EMT process

R Package Documentation

Browse R Packages

We want your feedback!

CityUHK-CompBio/nemTar
A method for singlaing network inference from (epi)genetic aberration profiles