pathPlot: Variable values view in metabolic pathways

View source: R/drawing_pathway.R

pathPlot    R Documentation

Description

Variable values view in KEGG metabolic pathways

Usage

pathPlot(
  gene.data = NULL,
  cpd.data = NULL,
  labels,
  varr.diff.list = NULL,
  threshold = NULL,
  thr.value = 0.05,
  FUN = median,
  species,
  pathway.id,
  kegg.native = TRUE,
  file.name = "path"
)

Arguments

gene.data

either a vector (single sample) or matrix-like data (multiple samples). A vector should be numeric with gene IDs as names, or it may be a character vector of gene IDs. A character vector is treated as discrete or count data. A matrix-like data structure has genes as rows and samples as columns; row names should be gene IDs. Here, gene ID is a generic concept that includes multiple types of gene, transcript and protein IDs uniquely mappable to KEGG gene IDs. KEGG ortholog IDs are also treated as gene IDs in order to handle metagenomic data. Check details for mappable ID types. Default gene.data=NULL.
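The two accepted shapes of gene.data can be sketched in base R as follows. This is only an illustration: the gene IDs are borrowed from the example at the end of this page, and the values and sample names are made up.

```r
# Single sample: a numeric vector named by gene ID (values are illustrative).
gene.vec <- c("4790" = 1.2, "4791" = -0.4, "595" = 0.8)

# Multiple samples: genes as rows, samples as columns, gene IDs as row names.
set.seed(1)
gene.mat <- matrix(rnorm(3 * 4), nrow = 3, ncol = 4,
                   dimnames = list(names(gene.vec), paste0("sample", 1:4)))

# If the data are stored with samples as rows (as in the example at the end
# of this page), transpose them so that genes end up as rows.
samples.by.genes <- as.data.frame(t(gene.mat))
gene.data <- t(samples.by.genes)

rownames(gene.data)  # gene IDs
colnames(gene.data)  # sample names
```

Transposing a data frame with t() returns a matrix, which matches the "matrix-like" requirement described above.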

cpd.data

the same as gene.data, except named with IDs mappable to KEGG compound IDs. Over 20 types of IDs included in the CHEMBL database can be used here. Check details for mappable ID types. Default cpd.data=NULL. Note that gene.data and cpd.data can't both be NULL.

labels

a vector of -1s, 0s, and 1s associating each sample with a phenotype. The value 0 corresponds to the first phenotype class of interest, 1 to the second phenotype class of interest, and -1 to the other classes, if there are more than two classes in the gene expression data.
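As a minimal sketch of this coding, a labels vector can be derived from a phenotype annotation in base R. The phenotype names ("control", "tumor", "other") are hypothetical and serve only to illustrate the mapping.

```r
# Hypothetical phenotype annotation, one entry per sample.
phenotype <- c("control", "tumor", "control", "tumor", "other", "tumor")

# 0 = first class of interest, 1 = second class of interest, -1 = all others.
labels <- ifelse(phenotype == "control", 0L,
                 ifelse(phenotype == "tumor", 1L, -1L))
labels
```

The vector must have one entry per sample, in the same order as the samples in gene.data or cpd.data.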

varr.diff.list

an output data frame from the diffNetAnalysis function. The data frame has genes as rows and, as columns, the test statistic, nominal p-value, q-value (p-value FDR-adjusted for multiple tests) and the network measures for each network. Row names should be gene IDs. Here, gene ID is a generic concept that includes multiple types of gene, transcript and protein IDs uniquely mappable to KEGG gene IDs.

threshold

a character string indicating which column of "varr.diff.list" is used to select the genes or compounds drawn on the metabolic map. The options are "pvalue" or "qvalue", to filter by nominal p-value or q-value (p-value FDR-adjusted for multiple tests), respectively. With the default threshold=NULL, no row of the data frame is filtered out.

thr.value

a numeric value indicating the upper threshold value to filter data frame rows.
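The interplay of threshold and thr.value can be sketched in base R. This is an assumption-laden illustration, not the package's code: the helper filter_rows, the column names of the mock table, and the use of "less than or equal" as the comparison are all hypothetical.

```r
# Mock table in the spirit of varr.diff.list; column names are assumptions.
diff.table <- data.frame(
  pvalue = c(0.001, 0.20, 0.03, 0.60),
  qvalue = c(0.004, 0.40, 0.06, 0.80),
  row.names = c("4790", "4791", "595", "898")
)

# Hypothetical helper: keep rows whose chosen column is at or below thr.value.
# threshold = NULL keeps every row, mirroring the documented default.
filter_rows <- function(tab, threshold = NULL, thr.value = 0.05) {
  if (is.null(threshold)) return(tab)
  threshold <- match.arg(threshold, c("pvalue", "qvalue"))
  tab[tab[[threshold]] <= thr.value, , drop = FALSE]
}

rownames(filter_rows(diff.table, "pvalue"))  # "4790" "595"
rownames(filter_rows(diff.table, "qvalue"))  # "4790"
nrow(filter_rows(diff.table))                # 4, nothing filtered
```

Filtering on the q-value is the stricter choice here because it accounts for multiple testing.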

FUN

a function that defines which value is drawn on the metabolic map. Default FUN=median.
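This page does not spell out how FUN is applied. One plausible reading, shown here purely as an assumption, is that FUN summarises each gene across the samples of each class in labels. The helper summarise_by_class is hypothetical and is not part of the package.

```r
set.seed(2)
# Genes as rows, samples as columns (illustrative IDs and values).
gene.data <- matrix(rnorm(3 * 6), nrow = 3,
                    dimnames = list(c("4790", "4791", "595"), paste0("s", 1:6)))
labels <- c(0, 0, 0, 1, 1, 1)

# Hypothetical helper: apply FUN to each gene within each class of samples.
summarise_by_class <- function(x, labels, FUN = median) {
  classes <- sort(unique(labels))
  out <- sapply(classes, function(cl) {
    apply(x[, labels == cl, drop = FALSE], 1, FUN)
  })
  colnames(out) <- paste0("class", classes)
  out
}

class.values <- summarise_by_class(gene.data, labels, FUN = median)
class.values  # one row per gene, one column per class
```

Under this reading, swapping FUN for mean or max changes only the per-class summary that would be colored on the map.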

species

character, either the KEGG code, the scientific name or the common name of the target species. This applies to both the pathway and gene.data or cpd.data. When a KEGG ortholog pathway is considered, use species="ko". For human, species="hsa" is equivalent to "Homo sapiens" (scientific name) or "human" (common name). The usage above declares no default for species, so it should be supplied explicitly.

pathway.id

character vector, the KEGG pathway ID(s), usually 5 digits; may also include the 3-letter KEGG species code.

kegg.native

logical, whether to render pathway graph as native KEGG graph (.png) or using graphviz layout engine (.pdf). Default kegg.native=TRUE.

file.name

character, the suffix to be added after the pathway name as part of the output graph file name. Sample names or column names of gene.data or cpd.data are also added when there are multiple samples. Default file.name="path".

Details

Pathview maps and renders user data on relevant pathway graphs. Pathview is a stand-alone program for pathway-based data integration and visualization. It also integrates seamlessly with pathway and functional analysis tools for large-scale and fully automated analysis.

Pathview provides strong support for data integration. It works with: 1) essentially all types of biological data mappable to pathways, 2) over 10 types of gene or protein IDs, and 20 types of compound or metabolite IDs, 3) pathways for over 2000 species as well as KEGG orthology, 4) various data attributes and formats, i.e. continuous/discrete data, matrices/vectors, single/multiple samples etc. To see mappable external gene/protein IDs do: data(gene.idtype.list); to see mappable external compound-related IDs do: data(rn.list); names(rn.list).

Pathview generates both native KEGG views and Graphviz views for pathways. Currently only KEGG pathways are implemented. Hopefully, pathways from Reactome, NCI and other databases will be supported in the future.

Value

From version 1.9.3, pathview can accept either a single pathway ID or multiple pathway IDs. The result returned by the pathview function is a named list corresponding to the input pathway IDs. Each element (one per pathway) is itself a named list with 2 elements ("plot.data.gene", "plot.data.cpd"). Each of these is a data.frame or NULL, depending on the corresponding input data, gene.data and cpd.data. These data.frames record the plot data for mapped gene or compound nodes: rows are mapped genes/compounds, and the columns are:

kegg.names

standard KEGG IDs/names for mapped nodes: Entrez Gene IDs or KEGG Compound Accessions.

labels

node labels to be used when needed.

all.mapped

all molecule (gene or compound) IDs mapped to this node.

type

node type; currently 4 types are supported: "gene", "enzyme", "compound" and "ortholog".

x

x coordinate in the original KEGG pathway graph.

y

y coordinate in the original KEGG pathway graph.

width

node width in the original KEGG pathway graph.

height

node height in the original KEGG pathway graph.

other columns

columns of the mapped gene/compound data and corresponding pseudo-color codes for individual samples.

The results returned by keggview.native and keggview.graph are both lists of graph plotting parameters. These are not intended to be used externally.
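Assuming the returned object follows the structure described above, it can be inspected with ordinary list and data.frame operations. The object below is a hand-built mock, not real output: the element name "hsa05200", the coordinates and the node sizes are made up for illustration.

```r
# Mock of the documented return structure for a single pathway.
result <- list(
  hsa05200 = list(
    plot.data.gene = data.frame(
      kegg.names = c("4790", "595"),
      labels = c("NFKB1", "CCND1"),
      type = c("gene", "gene"),
      x = c(100, 250), y = c(80, 120),
      width = c(46, 46), height = c(17, 17),
      stringsAsFactors = FALSE
    ),
    plot.data.cpd = NULL  # NULL because no cpd.data was supplied
  )
)

# Which pathways produced gene plot data?
has.genes <- !vapply(result, function(p) is.null(p$plot.data.gene), logical(1))
names(result)[has.genes]

# Node labels and coordinates for one pathway.
node.info <- result$hsa05200$plot.data.gene[, c("labels", "x", "y")]
node.info
```

Checking for NULL before indexing avoids errors when only one of gene.data and cpd.data was supplied.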

References

Luo, W. and Brouwer, C., Pathview: an R/Bioconductor package for pathway based data integration and visualization. Bioinformatics, 2013, 29(14): 1830-1831, doi: 10.1093/bioinformatics/btt285

Examples

library(BioNetStat)

set.seed(5)
# Toy expression table: 40 samples (rows) by 30 genes (columns).
# Note: rnorm(120) is recycled to fill the 40 x 30 matrix.
expr <- as.data.frame(matrix(rnorm(120), 40, 30))
# Use Entrez Gene IDs as the gene (column) names.
names(expr) <- c(4790, 4791, 4792, 4793, 84807, 4794, 4795, 64332, 595, 898, 23552, 1017, 8099,
  10263, 4609, 23077, 26292, 84073, 4610, 4613, 10408, 80177, 114897, 114898, 114899, 114900,
  114904, 114905, 390664, 338872)
# One class label per sample.
labels <- rep(0:3, 10)
adjacencyMatrix1 <- adjacencyMatrix(method = "spearman", association = "pvalue",
  threshold = "fdr", thr.value = 0.05, weighted = FALSE)
# numPermutations = 1 keeps the example fast; use at least 1000 permutations in a real analysis.
vertexCentrality <- degreeCentralityVertexTest(expr, labels, adjacencyMatrix1,
  numPermutations = 1)
# Prepend the gene IDs to the test results.
vertexCentrality2 <- cbind(c(4790, 4791, 4792, 4793, 84807, 4794, 4795, 64332, 595, 898, 23552,
  1017, 8099, 10263, 4609, 23077, 26292, 84073, 4610, 4613, 10408, 80177, 114897, 114898, 114899,
  114900, 114904, 114905, 390664, 338872), vertexCentrality)
# gene.data expects genes as rows, so the sample-by-gene table is transposed.
pathPlot(gene.data = t(expr), cpd.data = NULL, labels = labels, varr.diff.list = vertexCentrality2,
  threshold = NULL, thr.value = 1, FUN = median, species = "hsa", pathway.id = "05200",
  kegg.native = TRUE, file.name = "path")

jardimViniciusC/BioNetStat documentation built on July 3, 2022, 3:32 a.m.