go_reduce | R Documentation |
This function reduces GO redundancy by first creating a semantic similarity matrix (using GOSemSim::mgoSim), which is then passed through rrvgo::reduceSimMatrix(). The latter reduces a set of GO terms based on their semantic similarity and scores (in this case, a default score based on set size is assigned).
go_reduce(
  pathway_df,
  orgdb = "org.Hs.eg.db",
  threshold = 0.7,
  scores = NULL,
  measure = "Wang"
)
pathway_df |
a
|
orgdb |
|
threshold |
|
scores |
named vector, with scores (weights) assigned to each
term. Higher is better. Can be NULL (the default), in which case a default
score based on set size is assigned, thus favoring larger sets.
Note: if you have p-values as scores, consider log-transforming them
( |
measure |
|
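The note on p-values in the scores argument can be illustrated with a minimal sketch. The GO IDs and p-values below are hypothetical, and the choice of -log10 is an assumption (the text only says to log-transform); the idea is that smaller p-values should become larger scores, since higher is better. Check the rrvgo documentation for the exact naming expected of the vector.

```r
# Hypothetical GO IDs and p-values, for illustration only
go_ids <- c("GO:0000001", "GO:0000002", "GO:0000003")
p_values <- c(0.04, 1e-6, 0.001)

# Negative log10 transform: the smallest p-value gets the highest score
scores <- setNames(-log10(p_values), go_ids)

# The term with the smallest p-value now ranks first
names(which.max(scores))
```

A vector built this way could then be supplied as the scores argument in place of the default NULL.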
By default, semantic similarity is calculated using the "Wang" method, a graph-based strategy that computes semantic similarity from the topology of the GO graph structure. GOSemSim::mgoSim does permit the use of other measures (primarily information-content measures), but "Wang" is the default in GOSemSim and is therefore the default here. If you wish to use a different measure, please refer to the GOSemSim documentation.
rrvgo::reduceSimMatrix() creates a distance matrix, defined as (1-simMatrix). The terms are then hierarchically clustered using complete linkage (an agglomerative, or "bottom-up", clustering approach), and the tree is cut at the desired threshold. The term with the highest "score" is used to represent each group.
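The clustering steps described above can be sketched in base R. This is an illustration under stated assumptions, not the rrvgo implementation: the term names, similarity values, and scores are made up, and only stats::hclust and stats::cutree are used.

```r
# Toy similarity matrix for four hypothetical terms
terms <- c("GO:A", "GO:B", "GO:C", "GO:D")
sim <- matrix(0.1, nrow = 4, ncol = 4, dimnames = list(terms, terms))
diag(sim) <- 1
sim["GO:A", "GO:B"] <- sim["GO:B", "GO:A"] <- 0.9
sim["GO:C", "GO:D"] <- sim["GO:D", "GO:C"] <- 0.8

# Hypothetical scores (higher is better)
scores <- c("GO:A" = 1, "GO:B" = 3, "GO:C" = 5, "GO:D" = 2)
threshold <- 0.7

# Distance is 1 - similarity; cluster with complete linkage
tree <- hclust(as.dist(1 - sim), method = "complete")

# Cut the tree at the threshold to obtain group membership per term
cluster <- cutree(tree, h = threshold)

# Within each group, the highest-scoring term becomes the representative
parent <- vapply(
  split(names(cluster), cluster),
  function(x) x[which.max(scores[x])],
  character(1)
)

# Map every term to the representative of its group
parent_of_term <- setNames(parent[as.character(cluster)], names(cluster))
parent_of_term
```

In this toy case GO:A and GO:B fall into one group and GO:C and GO:D into another, so each term is mapped to the higher-scoring member of its pair.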
A tibble of pathway results in which each pathway has been assigned a "reduced" parent term. New columns:
parent_id: the GO ID of the parent term
parent_term: a description of the parent GO ID
parent_sim_score: the similarity score between the child GO term and its parent term
Yu et al. (2010). GOSemSim: an R package for measuring semantic similarity among GO terms and gene products. Bioinformatics (Oxford, England), 26(7):976–978, April 2010. http://bioinformatics.oxfordjournals.org/cgi/content/abstract/26/7/976 PMID: 20179076
Yu (2021). Biomedical Knowledge Mining using GOSemSim and clusterProfiler. https://yulab-smu.top/biomedical-knowledge-mining-book/index.html
Sayols S (2020). rrvgo: a Bioconductor package to reduce and visualize Gene Ontology terms. https://ssayols.github.io/rrvgo
go_plot for plotting the output of go_reduce, GOSemSim::mgoSim for calculation of semantic similarity, and rrvgo::reduceSimMatrix() for reduction of the similarity matrix.
Other GO-related functions: go_plot()
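# Locate the example GO results shipped with the package
file_path <- system.file(
  "testdata", "go_test_data.txt",
  package = "rutils",
  mustWork = TRUE
)

# Read the tab-delimited pathway results
pathway_df <- readr::read_delim(file_path, delim = "\t")

# Reduce redundant GO terms using Wang similarity and default scores
go_reduce(
  pathway_df = pathway_df,
  orgdb = "org.Hs.eg.db",
  threshold = 0.9,
  scores = NULL,
  measure = "Wang"
)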