library(tidyverse)
library(flextable)

# Default chunk options for all slides
knitr::opts_chunk$set(
  fig.width = 4.25, fig.height = 3.5, fig.retina = 3,
  message = FALSE, warning = FALSE,
  cache = TRUE, autodep = TRUE, hiline = TRUE
)

# Chunks flagged with fig.callout hide their code and use a wide figure
knitr::opts_hooks$set(fig.callout = function(options) {
  if (options$fig.callout) {
    options$echo <- FALSE
    options$out.height <- "99%"
    options$fig.width <- 16
    options$fig.height <- 8
  }
  options
})

# Turn lines ending in "#<<" into highlighted lines (prefixed with "*")
hook_source <- knitr::knit_hooks$get('source')
knitr::knit_hooks$set(source = function(x, options) {
  if (!is.null(options$hiline) && options$hiline) {
    x <- stringr::str_replace(x, "^ ?(.+)\\s?#<<", "*\\1")
  }
  hook_source(x, options)
})

options(htmltools.dir.version = FALSE, width = 90)

# Render a table as HTML with three digits
as_table <- function(...) knitr::kable(..., format = 'html', digits = 3)
background-image: url("../inst/images/fgczgseaora.png")
.img-right[
]
Pathway analysis uses a priori gene sets that have been grouped together by their involvement in the same biological pathway, or by proximal location on a chromosome. Gene set databases include Gene Ontology (GO), KEGG, Reactome, and many more.
# 2 x 2 contingency table: GO term membership vs. differential expression
tab <- matrix(c(12, 3, 7, 24), nrow = 2, byrow = TRUE)
dimnames(tab) <- list(
  "GO Term" = c("Contained", "Not Contained"),
  "Differentially expressed" = c(" Yes ", " No ")
)
cat("Pathway GO:0003091")
tab
# Fisher's exact test for over-representation
cat("p-value:", round(fisher.test(tab)$p.value, 5))
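As a reminder of what `fisher.test` computes for the table above, here is the standard hypergeometric formulation. The symbols $N$, $K$, $n$ and $k$ are introduced here for illustration and are not part of the package. With the margins held fixed, the number $X$ of differentially expressed proteins that fall into the GO term has probability

$$
P(X = k) = \frac{\binom{K}{k}\,\binom{N-K}{n-k}}{\binom{N}{n}},
$$

where, for the table above, $N = 46$ is the total number of proteins, $K = 15$ the number annotated with the GO term, $n = 19$ the number of differentially expressed proteins, and $k = 12$ the observed overlap. The two-sided p-value reported by `fisher.test` is the sum of these probabilities over all tables that are no more probable than the observed one.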
.img-right[
]
.footnote[Gene sets can be highly correlated because they contain the same proteins. Multiplicity adjustment (FDR) assumes independence.]
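A minimal sketch of an FDR adjustment across gene sets, using base R `p.adjust` with the Benjamini-Hochberg method. The gene set names and p-values are made up for illustration, and this is not necessarily how the packages discussed here adjust their results.

```r
# Hypothetical raw p-values for five gene sets (illustration only)
p_raw <- c(GO_A = 0.001, GO_B = 0.008, GO_C = 0.039, GO_D = 0.041, GO_E = 0.60)

# Benjamini-Hochberg adjustment; valid under independence
# (or positive dependence) of the tests
p_bh <- p.adjust(p_raw, method = "BH")

# Gene sets that remain significant at an FDR of 5%
significant <- names(p_bh)[p_bh < 0.05]

print(round(p_bh, 4))
print(significant)
```

When gene sets share many proteins, their tests are not independent, so the adjusted values should be read with that caveat in mind.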
.img-right[
]
# One row per package: repository, maintenance status and supported features
benchmark <- list(
  c("WebGestaltR", "CRAN", "+", "-", "+", "+", "+"),
  c("FGNet", "Bioc", "+", "(-)", "(-)", "-", "+"),
  c("HTSanalyzeR", "Bioc", "-", "(-)", "-", "+", "+"),
  c("sigora", "CRAN", "+", "+", "(-)", "+", "-"),
  c("SetRank", "CRAN", "-", "(-)", "-", "-", "+"),
  c("STRINGdb", "Bioc", "+", "-", "(-)", "+", "+"),
  c("enrichR", "CRAN", "+", "-", "+", "(+)", "+"),
  c("topGO", "Bioc", "...", "", "", "", "")
)
benchmark <- do.call("rbind", benchmark)
colnames(benchmark) <- c("Package", "Repo", "Maintenance", "offline", "ID Mapping", "ORA", "GSEA")
flextable(data.frame(benchmark)) %>%
  set_caption("R packages for pathway analysis")
- WebGestaltR (online only)
- sigORA (offline)

.footnote[WebGestaltR: various gene set databases, ID mapping, allows downloading HTML results. sigORA: uses gene pair signatures. It searches the background and the pathways for protein pairs unique to a given pathway, which decreases the correlation among gene sets.]
.left-code[
runWebGestaltGSEA(
  data = dd,
  fpath = "",
  ID_col = "UniprotID",
  score_col = "estimate",
  organism = "hsapiens",
  target = "geneontology_Biological_Process",
  nperm = 500,
  outdir = file.path(odir, "WebGestaltGSEA")
)
] .right-code[
runWebGestaltORA(
  data = dd,
  fpath = "",
  ID_col = "UniprotID",
  score_col = "estimate",
  organism = "hsapiens",
  threshold = 1,
  greater = TRUE,
  target = "geneontology_Biological_Process",
  nperm = 500,
  outdir = file.path(odir, "WebGestaltORA")
)

runSIGORA(
  data = dd,
  score_col = "estimate",
  threshold = 1,
  greater = TRUE,
  target = "GO",
  outdir = file.path(odir, "sigORA")
)
]
```{bash eval=FALSE}
Rscript lfq_multigroup_gsea.R ./foldchange_estimates.xlsx -o hsapiens
Rscript lfq_multigroup_ora.R ./foldchange_estimates.xlsx -t uniprotswissprot
```

The enrichment methods in this package (ORA, GSEA, sigORA) come with a `docopt`-based command line tool to facilitate analysing batches of files.

---

# Command line interface

```r
"WebGestaltR GSEA for multigroup reports

Usage:
  lfq_multigroup_gsea.R <grp2file> [--organism=<organism>] [--outdir=<outdir>] [--idtype=<idtype>] [--ID_col=<ID_col>] [--nperm=<nperm>] [--score_col=<score_col>] [--contrast=<contrast>]

Options:
  -o --organism=<organism>    organism [default: hsapiens]
  -r --outdir=<outdir>        output directory [default: results_gsea]
  -t --idtype=<idtype>        type of id used for mapping [default: uniprotswissprot]
  -i --ID_col=<ID_col>        Column containing the UniprotIDs [default: UniprotID]
  -n --nperm=<nperm>          number of permutations to calculate enrichment scores [default: 500]
  -e --score_col=<score_col>  column containing fold changes [default: pseudo_estimate]
  -c --contrast=<contrast>    column containing fold changes [default: contrast]

Arguments:
  grp2file  input file
" -> doc

library(docopt)
opt <- docopt(doc)
```
- Creates a folder structure with HTML files visualizing the ORA and GSEA results for all selected targets, e.g. GO Bioprocess, GO Molecular Function.
- These files are linked from an index.html.
- The output can easily be stored and delivered as part of an analysis.
.img-right[
]
.img-left[
]
.img-right[
]
background-image: url("../inst/images/brussels.jpg")
(enrichr, topGO, ?)

Paolo Nanni, Christian Panse, Ralph Schlapbach, Tobias Kockmann