gosummaries: Constructor for gosummaries object

View source: R/GOsummaries.R

Description

Constructs a gosummaries object, which contains all the information needed to draw the figure: gene lists and their annotations, expression data and all the relevant texts.

Usage

gosummaries(x = NULL, ...)

## Default S3 method:
gosummaries(x = NULL, wc_data = NULL,
  organism = "hsapiens", go_branches = c("BP", "keg", "rea"),
  max_p_value = 0.01, min_set_size = 50, max_set_size = 1000,
  max_signif = 40, ordered_query = TRUE, hier_filtering = "moderate",
  score_type = "p-value", wc_algorithm = "middle",
  wordcloud_legend_title = NULL, ...)

Arguments

x

list of arrays of gene names (or list of lists of arrays of gene names)

...

additional parameters for gprofiler function

wc_data

precalculated GO enrichment results (see Details)

organism

the organism that the gene lists correspond to. The format should be as follows: "hsapiens", "mmusculus", "scerevisiae", etc.

go_branches

GO tree branches and pathway databases as denoted in g:Profiler (Possible values: BP, CC, MF, keg, rea)

max_p_value

threshold for p-values that have been corrected for multiple testing

min_set_size

minimal size of functional category to be considered

max_set_size

maximal size of functional category to be considered

max_signif

maximal number of categories returned per query

ordered_query

logical showing if the lists are ordered or not (it determines if the ordered query algorithm is used in g:Profiler)

hier_filtering

a type of hierarchical filtering used when reducing the number of g:Profiler results (see gprofiler for further information)

score_type

indicates the type of scores in wc_data. Possible values: "p-value" and "count"

wc_algorithm

the type of word cloud layout algorithm used. Possible values are "top", which places the first word in the top corner, and "middle", which places the first word in the middle.

wordcloud_legend_title

title of the word cloud legend, should reflect the nature of the score

Details

The object is a list of "components", each defined by a gene list or a pair of gene lists. Each component is made up of slots; the ones referred to on this page are Title, Gene_lists and Data.

Some visual parameters are stored as attributes of the gosummaries object: score_type tells how to handle the scores associated with the word clouds, wc_algorithm specifies the word cloud layout algorithm, and wordcloud_legend_title specifies the title of the word cloud legend. They can be changed using the attr function.
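As a minimal sketch of changing these attributes after construction: the snippet below assumes that the attribute names are exactly the argument names listed above, and it reuses the custom word cloud tables from the Examples section so that no g:Profiler query is needed. It is guarded so that it does nothing when the GOsummaries package is not installed.

```r
# Sketch only: attribute names are assumed to match the constructor arguments.
if (requireNamespace("GOsummaries", quietly = TRUE)) {
  wcd1 <- data.frame(Term = c("KLF1", "KLF2", "POU5F1"),
                     Score = c(0.05, 0.001, 0.0001))
  wcd2 <- data.frame(Term = c("CD8", "CD248", "CCL5"),
                     Score = c(0.02, 0.005, 0.00001))

  # Build the object from precalculated word cloud data
  gs <- GOsummaries::gosummaries(wc_data = list(Results1 = wcd1,
                                                Results2 = wcd2))

  # Inspect the current settings
  attr(gs, "score_type")
  attr(gs, "wc_algorithm")

  # Change the layout algorithm and the legend title in place
  attr(gs, "wc_algorithm") <- "top"
  attr(gs, "wordcloud_legend_title") <- "Significance score"
}
```

Calling plot(gs) afterwards should pick up the modified settings.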

The word clouds are specified as data.frames with two columns: "Term" and "Score". To use custom data for the word clouds instead of the default GO enrichment results, pass it in the parameter wc_data. The input structure is similar to the gene list input, only instead of gene lists one supplies the two-column data.frames.
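The input structure described above can be sketched as follows. The helper make_wc_table is hypothetical (it is not part of GOsummaries) and only converts a named numeric vector into the two-column table; the terms and scores are the ones used in the Examples section. The final call is guarded so the snippet also runs without the package installed.

```r
# Hypothetical helper (not part of GOsummaries): turn a named numeric vector
# of scores into a data.frame with the columns "Term" and "Score".
make_wc_table <- function(scores) {
  stopifnot(is.numeric(scores), !is.null(names(scores)))
  data.frame(Term = names(scores), Score = unname(scores))
}

wcd1 <- make_wc_table(c(KLF1 = 0.05, KLF2 = 0.001, POU5F1 = 0.0001))
wcd2 <- make_wc_table(c(CD8 = 0.02, CD248 = 0.005, CCL5 = 0.00001))

# One word cloud per component: a named list of tables
wc_single <- list(Results1 = wcd1, Results2 = wcd2)

# Two word clouds per component: a named list of lists of tables
wc_paired <- list(Results = list(wcd1, wcd2))

if (requireNamespace("GOsummaries", quietly = TRUE)) {
  gs_single <- GOsummaries::gosummaries(wc_data = wc_single)
  gs_paired <- GOsummaries::gosummaries(wc_data = wc_paired)
}
```

The two list shapes mirror the two gene list shapes accepted by x: a flat list gives one word cloud per component, a nested list gives a pair per component.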

The GO enrichment analysis is performed using the g:Profiler web toolkit and its associated R package gProfileR. This means the computer must have internet access to annotate the gene lists. Since g:Profiler accepts a wide range of gene IDs, the user usually does not have to worry about converting the gene IDs into the right format. To be absolutely sure the tool recognizes the gene IDs, one can check whether they give any results in http://biit.cs.ut.ee/gprofiler/gconvert.cgi.

A typical GO enrichment analysis can return a lot of results, but these tend to be highly redundant. Since only a small number of categories fit into a word cloud, we have to bring down the number of categories to show and reduce the redundancy. For this we use the hierarchical filtering option "moderate" in g:Profiler. In g:Profiler the categories are grouped together when they share one or more enriched parents. The "moderate" option selects the most significant category from each such group. (See more at http://biit.cs.ut.ee/gprofiler/)
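The idea of keeping the most significant category per group can be illustrated with a toy sketch in base R. This is not g:Profiler's implementation: the group column is a hypothetical stand-in for "shares an enriched parent", which g:Profiler derives from the GO hierarchy, and the terms and p-values are made up purely for illustration.

```r
# Toy enrichment results; 'group' is a hypothetical label for categories
# that share an enriched parent. All values are invented.
res <- data.frame(
  term    = c("cell cycle", "mitotic cell cycle", "DNA repair",
              "response to DNA damage", "translation"),
  group   = c("g1", "g1", "g2", "g2", "g3"),
  p_value = c(1e-8, 1e-5, 1e-4, 1e-6, 1e-3),
  stringsAsFactors = FALSE
)

# Sort by significance, then keep the first (most significant) row per group
res <- res[order(res$p_value), ]
filtered <- res[!duplicated(res$group), ]

# Cap the number of categories, in the spirit of max_signif
max_signif <- 2
top <- head(filtered, max_signif)

print(filtered$term)
print(top$term)
```

Sorting first and then dropping duplicated groups keeps exactly one row per group, namely the one with the smallest p-value.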

The slots of the object can be filled with custom information using a function add_to_slot.gosummaries.

By default the Data slot is filled with a dataset that contains the number of genes in the Gene_lists slot. Expression data can be added to the object, for example by using the function add_expression.gosummaries. It is possible to derive your own format for the Data slot as well, as long as a panel plotting function for this data is also provided (see panel_boxplot for further information).

There are several constructors of gosummaries object that work on common analysis result objects, such as gosummaries.kmeans, gosummaries.MArrayLM and gosummaries.prcomp corresponding to k-means, limma and PCA results.

Value

A gosummaries type of object

Author(s)

Raivo Kolde <raivo.kolde@eesti.ee>

See Also

gosummaries.kmeans, gosummaries.MArrayLM, gosummaries.prcomp

Examples

## Not run: 
# Define gene lists 
genes1 = c("203485_at", "209469_at", "209470_s_at", "203999_at", 
"205358_at", "203130_s_at", "210222_s_at", "202508_s_at", "203001_s_at", 
"207957_s_at", "203540_at", "203000_at", "219619_at", "221805_at", 
"214046_at", "213135_at", "203889_at", "209990_s_at", "210016_at", 
"202507_s_at", "209839_at", "204953_at", "209167_at", "209685_s_at",  
"211276_at", "202391_at", "205591_at", 
"201313_at")
genes2 = c("201890_at", "202503_s_at", "204170_s_at", "201291_s_at", 
"202589_at", "218499_at", "209773_s_at", "204026_s_at", "216237_s_at", 
"202546_at", "218883_s_at", "204285_s_at", "208659_at", "201292_at", 
"201664_at")


gl1 = list(List1 = genes1,  List2 = genes2) # One list per component
gl2 = list(List = list(genes1, genes2)) # Two lists per component

# Construct gosummaries objects
gs1 = gosummaries(gl1)
gs2 = gosummaries(gl2)

plot(gs1, fontsize = 8)
plot(gs2, fontsize = 8)

# Changing slot contents using add_to_slot.gosummaries
gs1 = add_to_slot.gosummaries(gs1, "Title", list("Neurons", "Cell lines"))

# Adding expression data
data(tissue_example)

gs1 = add_expression.gosummaries(gs1, exp = tissue_example$exp, annotation = 
tissue_example$annot)
gs2 = add_expression.gosummaries(gs2, exp = tissue_example$exp, annotation = 
tissue_example$annot)

plot(gs1, panel_par = list(classes = "Tissue"), fontsize = 8)
plot(gs2, panel_par = list(classes = "Tissue"), fontsize = 8)

## End(Not run)

# Using custom annotations for word clouds
wcd1 = data.frame(Term = c("KLF1", "KLF2", "POU5F1"), Score = c(0.05, 0.001, 
0.0001))
wcd2 = data.frame(Term = c("CD8", "CD248", "CCL5"), Score = c(0.02, 0.005, 
0.00001))

gs = gosummaries(wc_data = list(Results1 = wcd1, Results2 = wcd2))
plot(gs)

gs = gosummaries(wc_data = list(Results = list(wcd1, wcd2)))
plot(gs)

# Adjust wordcloud legend title
gs = gosummaries(wc_data = list(Results = list(wcd1, wcd2)), 
wordcloud_legend_title = "Significance score")
plot(gs)

raivokolde/GOsummaries documentation built on May 26, 2019, 9:55 p.m.