reduceSimMatrix
Reduce a set of GO terms based on their semantic similarity and scores.
reduceSimMatrix(simMatrix, scores = NULL, threshold = 0.7, orgdb)
simMatrix
a (square) similarity matrix
scores
*named* vector with scores (weights) assigned to each term. Higher is better. Can be NULL (the default), meaning no scores are supplied; in that case a default score based on set size is assigned, which favors larger sets. Note: if you have p-values as scores, consider transforming them to -log(p) so that more significant terms get higher scores.
threshold
similarity threshold (0-1). Some guidance: Large (allowed similarity=0.9), Medium (0.7), Small (0.5), Tiny (0.4). Defaults to Medium (0.7).
orgdb
one of the org.* Bioconductor packages (the package name, or the orgdb object itself)
Currently, rrvgo uses the similarity between pairs of terms to compute a distance matrix, defined as (1-simMatrix). The terms are then hierarchically clustered using complete linkage, and the tree is cut at the desired threshold, picking the term with the highest score as the representative of each group.
Therefore, higher thresholds lead to fewer groups. The threshold should be read as the expected similarity of terms within a group, though this is not entirely correct: pairs of terms with a similarity below the threshold can still be put in the same group.
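The procedure described above can be sketched with base R alone. This is a minimal illustration, not the rrvgo implementation: the four term IDs, the similarity values and the scores are made up, and the sketch assumes that the cut height passed to `cutree()` is the threshold applied directly to the (1-simMatrix) distances.

```r
# Toy similarity matrix for four hypothetical terms (all values are made up).
terms <- c("GO:A", "GO:B", "GO:C", "GO:D")
sim <- matrix(c(1.0, 0.8, 0.2, 0.1,
                0.8, 1.0, 0.3, 0.2,
                0.2, 0.3, 1.0, 0.6,
                0.1, 0.2, 0.6, 1.0),
              nrow = 4, dimnames = list(terms, terms))

# Hypothetical scores, higher is better (e.g. -log10 of a q-value).
scores <- c("GO:A" = 3.2, "GO:B" = 5.1, "GO:C" = 1.4, "GO:D" = 2.7)
threshold <- 0.7

# Distance is 1 - similarity; cluster hierarchically with complete linkage.
tree <- hclust(as.dist(1 - sim), method = "complete")

# Assumption: the tree is cut at height = threshold on the distance scale.
cluster <- cutree(tree, h = threshold)

# Within each group, pick the term with the highest score as representative.
representative <- vapply(
  split(names(cluster), cluster),
  function(ids) ids[which.max(scores[ids])],
  character(1)
)

# One row per term, with its group and the representative of that group.
reduced <- data.frame(
  term = names(cluster),
  cluster = unname(cluster),
  representative = unname(representative[as.character(cluster)]),
  stringsAsFactors = FALSE
)
print(reduced)
```

With these toy values, a lower `threshold` (for example 0.3) cuts the tree closer to the leaves and leaves more, smaller groups, which matches the statement that higher thresholds lead to fewer groups.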
a data.frame with all terms and their "reducer" (NA if the term was not reduced)
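# Load the example enrichment results shipped with rrvgo
go_analysis <- read.delim(system.file("extdata/example.txt", package="rrvgo"))
# Compute pairwise semantic similarity between the GO terms
simMatrix <- calculateSimMatrix(go_analysis$ID, orgdb="org.Hs.eg.db", ont="BP", method="Rel")
# Named score vector: -log10(q-value), so that higher is better
scores <- setNames(-log10(go_analysis$qvalue), go_analysis$ID)
# Group similar terms and pick a representative for each group
reducedTerms <- reduceSimMatrix(simMatrix, scores, threshold=0.7, orgdb="org.Hs.eg.db")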