Working with SpidermiR package"

knitr::opts_chunk$set(dpi = 300)
knitr::opts_chunk$set(cache=FALSE)
devtools::load_all()

Introduction

Biological systems are composed of multiple layers of dynamic interaction networks. These networks can be decomposed, for example, into: co-expression, physical, co-localization, genetic, pathway, and shared protein domains.

GeneMania provides us with an enormous collection of data sets for interaction network studies [@ref1]. The data can be accessed and downloaded from different database, using a web portal. But currently, there is not a R-package to query and download these data.

An important regulatory mechanism of these network data involves microRNAs (miRNAs). miRNAs are involved in various cellular functions, such as differentiation, proliferation, and tumourigenesis. However, our understanding of the processes regulated by miRNAs is currently limited and the integration of miRNA data in these networks provides a comprehensive genome-scale analysis of miRNA regulatory networks.Actually, GeneMania doesn't integrate the information of miRNAs and their interactions in the network.

SpidermiR allows the user to query, prepare, download network data (e.g. from GeneMania), and to integrate this information with miRNA data with the possibility to analyze these downloaded data directly in one single R package. This techincal report gives a short overview of the essential SpidermiR methods and their application.

Installation

To install use the code below.

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("SpidermiR")

SpidermiRquery: Searching network

You can easily search GeneMania data using the SpidermiRquery function.

SpidermiRquery_species: Searching by species

The user can query the species supported by GeneMania, using the function SpidermiRquery_species:

org<-SpidermiRquery_species(species)

The list of species is shown below:

knitr::kable(org, digits = 2,
             caption = "List of species",row.names = TRUE)

SpidermiRquery_networks_type: Searching by network categories

The user can query the network types supported by GeneMania for a specific specie, using the function SpidermiRquery_networks_type. The user can select a specific specie using an index obtained by the function SpidermiRquery_species (e.g. organismID=org[6,] is the input for Homo_sapiens,organismID=org[9,] is the input for Saccharomyces cerevisiae )

net_type<-SpidermiRquery_networks_type(organismID=org[9,])

The list of network categories in Saccharomyces cerevisiae is shown below:

net_type

SpidermiRquery_spec_networks: Searching by species, and network categories

You can filter the search by species using organism ID (above reported), and the network category. The network category can be filtered using the following parameters:

net_shar_prot<-SpidermiRquery_spec_networks(organismID = org[9,],
                                    network = "SHpd")

The databases, which data are collected, are the output of this step. An example is shown below ( for Shared protein domains in Saccharomyces_cerevisiae data are collected in INTERPRO, and PFAM):

net_shar_prot

SpidermiRdownload: Downloading network data

The user in this step can download the data, as previously queried.

SpidermiRdownload_net: Download network

The user can download the data (previously queried) with SpidermiRdownload_net.

out_net<-SpidermiRdownload_net(net_shar_prot)

The list of SpidermiRdownload_net is shown below:

str(out_net)

SpidermiRdownload_miRNAprediction: Downloading miRNA predicted data target

The user can download the predicted miRNA-gene from 4 databases:DIANA, Miranda, PicTar and TargetScan using miRNAtap [@ref9].

mirna<-c('hsa-miR-567','hsa-miR-566')
SpidermiRdownload_miRNAprediction(mirna_list=mirna)

SpidermiRdownload_miRNAvalidate: Downloading miRNA validated data target

The user can download the validated miRNA-gene from: miRTAR and miRwalk [@ref2] [@ref3].

list<-SpidermiRdownload_miRNAvalidate(validated)

SpidermiRdownload_miRNAextra_cir:Download Extracellular Circulating microRNAs

The user can download extracellular circulating miRNAs from miRandola database

list_circ<-SpidermiRdownload_miRNAextra_cir(miRNAextra_cir)

SpidermiRdownload_pharmacomir: Download Pharmaco-miR Verified Sets

The user can download Pharmaco-miR predicted Sets from a fisher test. This function will give an information about the significant interactions between the miRNAs and the drug of interest (in this case tamoxifen). p-value and FDR will be visualized. The fisher test will be calculated considering the interactions drug and gene derived by DGIdb and MATADOR database [@ref7] [@ref8].

list<-SpidermiRdownload_miRNAvalidate(validated)
drug="TAMOXIFEN"
drug_genetarget<-SpidermiRdownload_pharmacomir(list[1:100,],drug="TAMOXIFEN")

The output generated by SpidermiRdownload_pharmacomir will give a table with 6 columns: in the first column miRNAs associated with the drug of interest, in the second column the number of drug targets, the number of miRNA targets, the common genes between drug targets and miRNA targets, p-value and FDR.

knitr::kable(drug_genetarget, digits = 2,
             caption = "miRNA associated with Tamoxifen",row.names = TRUE)

The list of drugs can be chosen between the drugs generated by the function SpidermiRdownload_drug_gene:

drug_genetarget<-SpidermiRdownload_drug_gene(drug_gene)

SpidermiRprepare: Preparing the data

SpidermiRprepare_NET: Prepare matrix of gene network with Ensembl Gene ID, and gene symbols

SpidermiRprepare_NET reads network data from SpidermiRdownload_net and enables user to prepare them for downstream analysis. In particular, it prepares matrix of gene network mapping Ensembl Gene ID to gene symbols. Gene symbols are needed to integrate miRNAdata.

geneSymb_net<-SpidermiRprepare_NET(organismID = org[9,],
                                    data = out_net)

The network with gene symbols ID is shown below:

knitr::kable(geneSymb_net[[1]][1:5,c(1,2,3,5,8)], digits = 2,
             caption = "shared protein domain",row.names = FALSE)

SpidermiRanalyze: : Analyze data from network data

SpidermiRanalyze_direct_net: Searching by biomarkers of interest with direct interaction

Starting from a set of biomarkers of interest (BI), genes, miRNA or both, given by the user, this function finds sub-networks including all direct interactions involving at least one of the BI.

biomark_of_interest<-c("hsa-let-7a","CDC34","hsa-miR-27a","PEX7","EPT1","FOX","hsa-miR-5a")
miRNA_NET <-data.frame(V1=c('hsa-let-7a','CASP3','BRCA','hsa-miR-7a','hsa-miR-5a','SMAD','SOX'),V2=c('CASP3','TAMOXIFEN','MYC','PTEN','FOX','HIF1','P53'),stringsAsFactors=FALSE)
GIdirect_net<-SpidermiRanalyze_direct_net(data=miRNA_NET,BI=biomark_of_interest)

The data frame of SpidermiRanalyze_direct_net, GIdirect_net, is shown below:

str(GIdirect_net)

SpidermiRanalyze_direct_subnetwork: Network composed by only the nodes in a set of biomarkers of interest

Starting from BI, this function finds sub-networks including all direct interactions involving only BI.

subnet<-SpidermiRanalyze_direct_subnetwork(data=miRNA_NET,BI=biomark_of_interest)

SpidermiRanalyze_subnetwork_neigh: Network composed by the nodes in the list of BI and all the edges among this brunch of nodes.

Starting from BI, this function finds sub-networks including all direct and indirect interactions involving at least one of BI.

GIdirect_net_neigh<-SpidermiRanalyze_subnetwork_neigh(data=miRNA_NET,BI=biomark_of_interest)

SpidermiRanalyze_degree_centrality: Ranking degree centrality genes

This function finds the number of direct neighbours of a node in a network and allows the selection of those nodes with a number of direct neighbours higher than a selected cut-off.

top10_cent_gene<-SpidermiRanalyze_degree_centrality(miRNA_NET)

SpidermiRanalyze_Community_detection: Find community detection

This function find the communities in the network, and describes them in terms of number of community elements (both genes and miRNAs). The function uses one of the algorithms currently implemented in [@ref5], selected by the user according to the user need.

The user can choose the algorithm in order to calculate the community structure:

comm<-  SpidermiRanalyze_Community_detection(data=miRNA_NET,type="FC")

SpidermiRanalyze_Community_detection_net: Community detection

Starting from one community to which some BI belong (the output of the previously described function) this function describes the community as network of elements (both genes and miRNAs).

cd_net<-SpidermiRanalyze_Community_detection_net(data=miRNA_NET,comm_det=comm,size=1)

SpidermiRanalyze_Community_detection_bi: Community detection from a set of biomarkers of interest

Starting from the community to which BI belong (the output of the previously described function), this function indicates if a set of BI is included within such community.

gi=c("P53","PTEN","KIT","CCND2")
mol<-SpidermiRanalyze_Community_detection_bi(data=comm,BI=gi)

SpidermiRvisualize: To visualize the network

SpidermiRvisualize_plot_target: Visualize the plot with miRNAs and the number of their targets in the network.

For each BI of a community, the user can visualize a plot showing the number of direct neighbours of such BI (the degree centrality of such BI).

SpidermiRvisualize_plot_target(data=miRNA_NET)

SpidermiRvisualize_degree_dist: plots the degree distribution of the network

This function plots the cumulative frequency distribution of degree centrality of a community.

SpidermiRvisualize_degree_dist(data=miRNA_NET)

SpidermiRvisualize_adj_matrix: plots the adjacency matrix of the network

It plots the adjacency matrix of the community, representing the degree of connections among the nodes.

SpidermiRvisualize_adj_matrix(data=miRNA_NET)

SpidermiRvisualize_3Dbarplot: 3D barplot

It plots a summary representation of the networks with the number of edges, nodes and miRNAs.

SpidermiRvisualize_3Dbarplot(Edges_1net=1041003,Edges_2net=100016,Edges_3net=3008,Edges_4net=1493,Edges_5net=1598,NODES_1net=16502,NODES_2net=13338,NODES_3net=1429,NODES_4net=675,NODES_5net=712,nmiRNAs_1net=0,nmiRNAs_2net=74,nmiRNAs_3net=0,nmiRNAs_4net=0,nmiRNAs_5net=37)

Features databases SpidermiR:

Features of databases integrated in SpidermiR are:

B<-matrix( c("Gene network", "Validated miRNA-target","", "Predicted miRNA-target","","","", "Extracellular Circulating microRNAs", "Pharmaco-miR","",

             "GeneMania", "miRwalk","miRTarBase", "DIANA", "Miranda", "PicTar","TargetScan","miRandola","DGIdb","MATADOR",

             "Current","miRwalk2","miRTarBase 7","DIANA- 5.0","N/A","N/A","TargetScan7.1","miRandola v 02/2017","N/A","N/A",
             2017,2015,2017,2013,2010,"N/A","2016",2017,2018,"N/A",
             "http://genemania.org/data/current/","http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk2/downloads/vtm/hsa-vtm-gene.rdata.zip","http://mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_SE_WR.xls","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","http://mirandola.iit.cnr.it/download/miRandola_version_02_2017.txt","http://dgidb.org/data/interactions.tsv", "http://matador.embl.de/media/download/matador.tsv.gz"

             ), nrow=10, ncol=5)
colnames(B)<-c("CATEGORY","EXTERNAL DATABASE","VERSION","LAST UPDATE","LINK")
knitr::kable(B, digits = 2,
             caption = "Features",row.names = FALSE)

Session Information


sessionInfo()

References



Try the SpidermiR package in your browser

Any scripts or data that you put into this service are public.

SpidermiR documentation built on Nov. 8, 2020, 8:25 p.m.