draw.bubblePlot: Heat Bubble Matrix Plot for Top Drivers in NetBID2 Analysis

View source: R/pipeline_functions.R

draw.bubblePlot R Documentation

Description

draw.bubblePlot combines a matrix bubble chart with a heat map. Bubble color encodes the P-value from Fisher's exact test, and bubble size encodes the number of target genes that intersect with the gene set. Rows are enriched gene sets and columns are top drivers. Users can also check the number of protein-coding genes targeted by each driver.
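The quantities behind a single bubble can be sketched in base R. This is a minimal illustration, not the package's internal code: the gene names and sizes are hypothetical, and the one-sided alternative and the way the 2 x 2 table is built are assumptions.

```r
# Hypothetical toy data: 100 background genes, one gene set, one driver's targets
bg_genes <- paste0("G", 1:100)
gene_set <- paste0("G", 1:20)             # 20 genes in the gene set
targets  <- paste0("G", c(1:10, 51:60))   # 20 targets, 10 of them in the gene set

# Intersected size: this is what bubble size would represent
overlap <- length(intersect(targets, gene_set))

# 2 x 2 contingency table of background genes:
# rows = in gene set / not in gene set, columns = target / not target
in_both     <- overlap
set_only    <- length(setdiff(gene_set, targets))
target_only <- length(setdiff(targets, gene_set))
neither     <- length(bg_genes) - in_both - set_only - target_only
tab <- matrix(c(in_both, target_only, set_only, neither), nrow = 2,
              dimnames = list(c("in_set", "not_in_set"),
                              c("target", "not_target")))

# Fisher's exact test for over-representation (assumed one-sided here);
# the resulting P-value is what bubble color would represent
res <- fisher.test(tab, alternative = "greater")
pv  <- res$p.value
```

Repeating this for every driver and gene set pair gives the gene set by driver grid that the plot displays.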

Usage

draw.bubblePlot(
  driver_list = NULL,
  show_label = driver_list,
  Z_val = NULL,
  driver_type = NULL,
  target_list = NULL,
  transfer2symbol2type = NULL,
  bg_list = NULL,
  min_gs_size = 5,
  max_gs_size = 500,
  gs2gene = NULL,
  use_gs = NULL,
  display_gs_list = NULL,
  Pv_adj = "none",
  Pv_thre = 0.1,
  top_geneset_number = 30,
  top_driver_number = 30,
  pdf_file = NULL,
  main = "",
  mark_gene = NULL,
  driver_cex = 1,
  gs_cex = 1,
  only_return_mat = FALSE
)

Arguments

driver_list

a vector of characters, the names of top drivers.

show_label

a vector of characters, the names of top drivers to be displayed in the plot. Defaults to driver_list, so the names in driver_list are displayed unless show_label is given.

Z_val

a vector of numerics, the Z statistics of the driver_list. It is highly suggested to assign names to this vector. If the vector is nameless, the function will use the names of driver_list by default.

driver_type

a vector of characters, the biotype or other characteristics of the driver. In the demo, we use "gene_biotype" column in the master table as input. It is highly suggested to assign names to this vector. If the vector is nameless, the function will use the names of driver_list by default. Default is NULL.
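A short sketch of the naming advice for Z_val and driver_type. The driver names and values are hypothetical and serve only to show how names tie each value to its driver.

```r
# Hypothetical driver names and values, for illustration only
driver_list <- c("DRIVER_A", "DRIVER_B", "DRIVER_C")
Z_val       <- c(3.2, -2.7, 4.1)
driver_type <- c("protein_coding", "lincRNA", "protein_coding")

# Attach names so each value is matched to its driver by name, not by position
names(Z_val)       <- driver_list
names(driver_type) <- driver_list

# Values can now be looked up by driver name
Z_val["DRIVER_B"]
driver_type["DRIVER_B"]
```

Named vectors stay correct even if the drivers are later reordered or subset.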

target_list

list, the driver-to-target list object. The names of the list elements are drivers. Each element is a data frame that usually contains at least three columns: "target", target gene names; "MI", mutual information; "spearman", Spearman correlation coefficient. Users can call get_net2target_list to create this list and follow the suggested pipeline.
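The shape of target_list described above can be illustrated with a hand-built toy object. All driver names, gene names and values here are hypothetical; in a real analysis the list would come from get_net2target_list.

```r
# Hypothetical toy driver-to-target list (real lists come from get_net2target_list)
target_list <- list(
  DRIVER_A = data.frame(target   = c("G1", "G2", "G3"),
                        MI       = c(0.52, 0.31, 0.27),
                        spearman = c(0.61, -0.40, 0.35),
                        stringsAsFactors = FALSE),
  DRIVER_B = data.frame(target   = c("G2", "G4"),
                        MI       = c(0.44, 0.29),
                        spearman = c(-0.55, 0.38),
                        stringsAsFactors = FALSE)
)

# Number of targets per driver
n_targets <- vapply(target_list, nrow, integer(1))

# Target genes of one driver
target_list$DRIVER_A$target
```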

transfer2symbol2type

data.frame, the ID-conversion table for converting the original ID into gene symbol and gene biotype (at gene level), or into transcript symbol and transcript biotype (at transcript level). It is highly suggested to use get_IDtransfer2symbol2type to create this ID-conversion table.

bg_list

a vector of characters, a vector of background gene symbols. If NULL, genes in gs2gene will be used as background. Default is NULL.

min_gs_size

numeric, the minimum size of a gene set to analyze. Default is 5.

max_gs_size

numeric, the maximum size of a gene set to analyze. Default is 500.

gs2gene

list, a list of gene sets. The name of each element is the gene set name, and each element is a vector of the genes in that gene set. If NULL, all_gs2gene, which is created by the function gs.preload, will be used. Default is NULL.
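A minimal sketch of a custom gs2gene list and of size filtering with min_gs_size and max_gs_size. The gene set names and genes are hypothetical, and treating both bounds as inclusive is an assumption rather than documented behavior.

```r
# Hypothetical gene sets: names are gene set names, elements are gene vectors
gs2gene <- list(
  SET_SMALL = paste0("G", 1:3),     # 3 genes
  SET_OK    = paste0("G", 1:10),    # 10 genes
  SET_LARGE = paste0("G", 1:600)    # 600 genes
)

min_gs_size <- 5
max_gs_size <- 500

# Size of each gene set
gs_size <- lengths(gs2gene)

# Keep gene sets whose size lies within the bounds (assumed inclusive)
keep <- gs_size >= min_gs_size & gs_size <= max_gs_size
gs2gene_use <- gs2gene[keep]

names(gs2gene_use)
```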

use_gs

a vector of characters, the names of gene sets. If gs2gene is NULL, all_gs2gene will be used, and use_gs must be a subset of names(all_gs2gene). Please check all_gs2gene_info for a detailed category description. Default is c("H", "CP:BIOCARTA", "CP:REACTOME", "CP:KEGG").

display_gs_list

a vector of characters, the names of gene sets to be displayed in the plot. If NULL, all the gene sets will be displayed in descending order of their significance. Default is NULL.

Pv_adj

character, method to adjust P-value. Default is "none". For details, please check p.adjust.methods.

Pv_thre

numeric, threshold for the adjusted P-values. Default is 0.1.
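How Pv_adj and Pv_thre interact can be sketched with p.adjust from base R. The P-values below are hypothetical, and using a strict "less than" comparison against the threshold is an assumption.

```r
# Hypothetical raw P-values for five gene sets
pv <- c(GS_1 = 0.0004, GS_2 = 0.012, GS_3 = 0.08, GS_4 = 0.2, GS_5 = 0.6)

# Valid values for Pv_adj are listed in p.adjust.methods
p.adjust.methods

# "none" leaves the P-values unchanged; "BH" applies Benjamini-Hochberg
pv_none <- p.adjust(pv, method = "none")
pv_bh   <- p.adjust(pv, method = "BH")

# Gene sets passing the threshold under each method
Pv_thre   <- 0.1
pass_none <- names(pv)[pv_none < Pv_thre]
pass_bh   <- names(pv)[pv_bh < Pv_thre]
```

With these values, GS_3 passes without adjustment but not after BH adjustment, which shows why the choice of Pv_adj changes which gene sets can appear in the plot.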

top_geneset_number

integer, the number of top enriched gene sets to be displayed in the plot. Default is 30.

top_driver_number

integer, the number of top significant drivers to be displayed in the plot. Default is 30.

pdf_file

character, the file path for saving the plot as a PDF file. If NULL, no PDF file will be saved. Default is NULL.

main

character, an overall title for the plot.

mark_gene

a vector of characters, a vector of gene symbols to be highlighted in red in the plot. Default is NULL.

driver_cex

numeric, giving the amount by which the text of driver symbols should be magnified relative to the default. Default is 1.

gs_cex

numeric, giving the amount by which the text of gene set names should be magnified relative to the default. Default is 1.

only_return_mat

logical, if TRUE, the function will only return the gene set vs. driver matrix, whose values are the Z-statistics of the significance test, and the plot will not be generated. Default is FALSE.

Value

Returns a logical value if only_return_mat=FALSE; TRUE indicates that the plot has been created successfully. If only_return_mat=TRUE, returns the gene set vs. driver matrix instead.

Examples

analysis.par <- list()
analysis.par$out.dir.DATA <- system.file('demo1','driver/DATA/',package = "NetBID2")
NetBID.loadRData(analysis.par=analysis.par,step='ms-tab')
ms_tab <- analysis.par$final_ms_tab
sig_driver <- draw.volcanoPlot(dat=ms_tab,label_col='gene_label',
                               logFC_col='logFC.G4.Vs.others_DA',
                               Pv_col='P.Value.G4.Vs.others_DA',
                               logFC_thre=0.4,
                               Pv_thre=1e-7,
                               main='Volcano Plot for G4.Vs.others_DA',
                               show_label=FALSE,
                               label_type = 'origin',
                               label_cex = 0.5)
gs.preload(use_spe='Homo sapiens',update=FALSE)
db.preload(use_level='gene',use_spe='human',update=FALSE)
use_genes <- base::unique(analysis.par$merge.network$network_dat$target.symbol)
transfer_tab <- get_IDtransfer2symbol2type(from_type = 'external_gene_name',
                                           use_genes=use_genes,
                                           dataset='hsapiens_gene_ensembl')
## get transfer table !!!
draw.bubblePlot(driver_list=rownames(sig_driver),
               show_label=ms_tab[rownames(sig_driver),'gene_label'],
               Z_val=ms_tab[rownames(sig_driver),'Z.G4.Vs.others_DA'],
               driver_type=ms_tab[rownames(sig_driver),'gene_biotype'],
               target_list=analysis.par$merge.network$target_list,
               transfer2symbol2type=transfer_tab,
               min_gs_size=5,
               max_gs_size=500,use_gs=c('H'),
               top_geneset_number=5,top_driver_number=5,
               main='Bubbleplot for top driver targets',
               gs_cex = 0.4,driver_cex = 0.5)
 ## cex is set small here only to avoid a "figure margins too large" error;
 ## in real cases, users could set a larger cex or supply a PDF file name
## Not run: 
analysis.par <- list()
analysis.par$out.dir.DATA <- system.file('demo1','driver/DATA/',package = "NetBID2")
NetBID.loadRData(analysis.par=analysis.par,step='ms-tab')
ms_tab <- analysis.par$final_ms_tab
sig_driver <- draw.volcanoPlot(dat=ms_tab,label_col='gene_label',
                               logFC_col='logFC.G4.Vs.others_DA',
                               Pv_col='P.Value.G4.Vs.others_DA',
                               logFC_thre=0.4,
                               Pv_thre=1e-7,
                               main='Volcano Plot for G4.Vs.others_DA',
                               show_label=FALSE,
                               label_type = 'origin',
                               label_cex = 0.5)
gs.preload(use_spe='Homo sapiens',update=FALSE)
use_genes <- base::unique(analysis.par$merge.network$network_dat$target.symbol)
transfer_tab <- get_IDtransfer2symbol2type(from_type = 'external_gene_name',
                                           use_genes=use_genes,
                                           dataset='hsapiens_gene_ensembl')
## get transfer table !!!
analysis.par$out.dir.PLOT <- getwd() ## directory for saving the pdf files
mark_gene <- c('KCNA1','EOMES','KHDRBS2','RBM24','UNC5D') ## marker for Group4
draw.bubblePlot(driver_list=rownames(sig_driver),
               show_label=ms_tab[rownames(sig_driver),'gene_label'],
               Z_val=ms_tab[rownames(sig_driver),'Z.G4.Vs.others_DA'],
               driver_type=ms_tab[rownames(sig_driver),'gene_biotype'],
               target_list=analysis.par$merge.network$target_list,
               transfer2symbol2type=transfer_tab,
               min_gs_size=5,max_gs_size=500,
               use_gs=c('CP:KEGG','CP:BIOCARTA','H'),
               top_geneset_number=30,top_driver_number=50,
               pdf_file = sprintf('%s/bubbledraw.pdf',
               analysis.par$out.dir.PLOT),
               main='Bubbleplot for top driver targets',
               mark_gene=ms_tab[which(ms_tab$geneSymbol %in% mark_gene),
               'originalID_label'])

## End(Not run)

jyyulab/NetBID documentation built on Dec. 23, 2024, 6:34 a.m.