summarize_clusters: summarize_clusters


View source: R/generics.R

Description

Summarize the terms of each cluster using the PageRank algorithm.
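As a rough illustration of the idea (not the package's internal code), PageRank can be sketched in base R by power iteration over a small term graph. The terms, edge weights, and the `pagerank()` helper below are all hypothetical and exist only for this example; the package itself passes `...` on to textrank::textrank_sentences.

```r
# Hypothetical toy graph: nodes are terms, edge weights are co-occurrence counts.
terms <- c("interferon", "response", "signaling", "hypoxia")
adj <- matrix(0, 4, 4, dimnames = list(terms, terms))
adj["interferon", "response"] <- 2
adj["response", "signaling"] <- 1
adj["interferon", "signaling"] <- 1
adj["hypoxia", "response"] <- 1
adj <- adj + t(adj)  # make the graph undirected

# Illustrative helper (not part of bowerbird): PageRank by power iteration.
pagerank <- function(adj, damping = 0.85, tol = 1e-10, max_iter = 1000) {
  n <- nrow(adj)
  out_weight <- rowSums(adj)
  # Row-normalise to transition probabilities; dangling nodes jump uniformly.
  trans <- adj / ifelse(out_weight > 0, out_weight, 1)
  trans[out_weight == 0, ] <- 1 / n
  rank <- rep(1 / n, n)
  for (i in seq_len(max_iter)) {
    new_rank <- (1 - damping) / n + damping * as.vector(rank %*% trans)
    converged <- sum(abs(new_rank - rank)) < tol
    rank <- new_rank
    if (converged) break
  }
  setNames(rank, rownames(adj))
}

scores <- pagerank(adj)
# Terms ordered from most to least central in the toy graph.
top_terms <- names(sort(scores, decreasing = TRUE))
```

The highest-ranked terms would then serve as the summary label for a cluster; how ties and stop words are handled is left out of this sketch.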

Usage

## S3 method for class 'BOWER'
summarize_clusters(
  bower,
  cluster = NULL,
  pattern = NULL,
  sep = NULL,
  ncpus = NULL,
  disconnect_graph = FALSE,
  ...
)

## S3 method for class 'igraph'
summarize_clusters(
  graph,
  cluster = NULL,
  pattern = NULL,
  sep = NULL,
  ncpus = NULL,
  disconnect_graph = FALSE,
  ...
)

Arguments

cluster

vector of cluster labels for each geneset.

pattern

search pattern to remove from the terms. Unless specified, will default to built-in pattern.

sep

separator used/found in gene set names to be changed to blank spaces. Default value is underscore ('_').

ncpus

number of cores used for parallelizing reconstruction.

disconnect_graph

return a graph connecting only nodes in a cluster.

...

passed to textrank::textrank_sentences.

graph

geneset overlap graph.

Details

Given a list of texts, it creates a sparse matrix of tf-idf scores for the tokens in those texts. See https://github.com/saraswatmks/superml/blob/master/R/TfidfVectorizer.R. A k shortest-nearest neighbor graph is then computed using the overlap of the terms.
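The tf-idf step described above can be sketched in base R as follows. This is a minimal, dense-matrix illustration under stated assumptions: the gene set names, the `^HALLMARK_` prefix pattern, and the smoothed idf formula are chosen for the example and are not necessarily what the package or superml uses.

```r
# Hypothetical gene set names standing in for the terms of one cluster.
docs <- c("HALLMARK_INTERFERON_ALPHA_RESPONSE",
          "HALLMARK_INTERFERON_GAMMA_RESPONSE",
          "HALLMARK_HYPOXIA")

# Remove an assumed prefix pattern and turn the separator into spaces.
clean <- tolower(gsub("_", " ", sub("^HALLMARK_", "", docs)))
tokens <- strsplit(clean, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))

# Term-frequency matrix: one row per document, one column per token.
tf <- t(vapply(
  tokens,
  function(tk) as.numeric(table(factor(tk, levels = vocab))),
  numeric(length(vocab))
))
dimnames(tf) <- list(docs, vocab)

# Smoothed inverse document frequency (one common variant, assumed here).
n_docs <- nrow(tf)
doc_freq <- colSums(tf > 0)
idf <- log((1 + n_docs) / (1 + doc_freq)) + 1

# tf-idf: scale each token column by its idf weight.
tfidf <- sweep(tf, 2, idf, `*`)
```

Tokens that appear in only one gene set (here `hypoxia`) receive a higher weight than tokens shared across several (here `interferon`), which is what makes tf-idf useful for picking out distinguishing terms.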

Value

Returns a matrix of tf-idf scores for the tokens.

Examples

gmt_file <- system.file("extdata", "h.all.v7.4.symbols.gmt", package = "bowerbird")
bwr <- bower(gmt_file)
bwr <- snn_graph(bwr)
bwr <- find_clusters(bwr)
bwr <- summarize_clusters(bwr, ncpus = 1)
bwr

clatworthylab/bowerbird documentation built on Dec. 19, 2021, 5:15 p.m.