knitr::opts_knit$set(root.dir = ".")
knitr::opts_chunk$set(collapse = TRUE, warning = TRUE)
BiocStyle::markdown()
library("BiocStyle")
library("GSEAdv")
fl <- system.file("extdata", "Broad.xml", package = "GSEABase")
gss <- getBroadSets(fl)

Introduction

This vignette assumes that the reader is familiar with the infrastructure built in Bioconductor around gene sets, mainly r Biocpkg("GSEABase"). It provides examples of how to use the package to understand the relationships between genes and gene sets. The reader should also be familiar with the vignette r Biocpkg("GSEAdv", "vignette_describe.html", "describing") the relationship between genes and gene sets.

Isolation

To check whether a GeneSetCollection is more than a group of isolated genes, with no relation between the different gene sets, we can use the function isolation:

isolation(gss)
isolation(Info)

Distribution of genes and pathways by size

These functions calculate, for each gene or pathway, how its elements are distributed by size:

sizeGenes(Info)
sizePathways(Info)

We can see that genes are distributed in pathways of size 2 or 3, but pathways have genes that belong to either 1, 2 or 4 pathways. If we are only interested in how many different sizes they are distributed across, we can use the following functions:

Differences in genes or pathways

These two functions count how many elements of different sizes each gene or pathway is involved in:

sizesPerGene(Info)
sizesPerPathway(Info)
sizesPerPathway(gss)

sizesPerGene is equivalent to counting how many rows of the sizeGenes matrix are not empty; conversely, sizesPerPathway does the same for the sizePathways matrix.
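The relationship between the two kinds of summaries can be sketched in base R on a toy example. This is only an illustration under stated assumptions: the membership matrix, the object names and the layout of the result (genes in rows, pathway sizes in columns) are hypothetical and are not taken from the GSEAdv implementation.

```r
# Toy membership matrix (hypothetical data): rows are genes, columns are pathways.
membership <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 1, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("pathway", 1:3))
) == 1

# Size of each pathway: the number of genes it contains.
pathway_size <- colSums(membership)
size_levels <- sort(unique(pathway_size))

# For each gene, count how many of its pathways have each size.
# Assumed layout: one row per gene, one column per pathway size.
size_by_gene <- t(vapply(rownames(membership), function(gene) {
  sizes <- pathway_size[membership[gene, ]]
  as.integer(table(factor(sizes, levels = size_levels)))
}, integer(length(size_levels))))
colnames(size_by_gene) <- size_levels

# Number of different pathway sizes per gene: the non-zero entries of each row.
sizes_per_gene <- rowSums(size_by_gene > 0)

size_by_gene
sizes_per_gene
```

The same computation on the transposed membership matrix gives the pathway-centred summaries.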

Relationships between number of genes per pathway and number of pathways per gene

To obtain the conditional probability of the number of pathways per gene given the number of genes per pathway, $P( pathways_{genes} | genes_{pathway} )$, there is the function condPerGenes. The opposite, $P( genes_{pathway} | pathways_{genes} )$, can be computed with condPerPathways:

condPerGenes(Info)
condPerPathways(Info)
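To make the two conditional probabilities concrete, the following base R sketch computes them on a toy membership matrix. It assumes that the probabilities are taken over gene-pathway memberships; that definition, the data and all object names are assumptions made for illustration, and the functions in GSEAdv may compute them differently.

```r
# Toy membership matrix (hypothetical data): rows are genes, columns are pathways.
membership <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 1, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("pathway", 1:3))
) == 1

# One row per gene-pathway membership, as (row, col) indices.
pairs <- which(membership, arr.ind = TRUE)

# For each membership: how many pathways the gene is in,
# and how many genes the pathway has.
pathways_per_gene <- rowSums(membership)[pairs[, "row"]]
genes_per_pathway <- colSums(membership)[pairs[, "col"]]

# Joint counts over all memberships.
joint <- table(pathways_per_gene, genes_per_pathway)

# P(pathways per gene | genes per pathway): each column sums to 1.
cond_gene_given_pathway <- prop.table(joint, margin = 2)

# P(genes per pathway | pathways per gene): each row sums to 1.
cond_pathway_given_gene <- prop.table(joint, margin = 1)

cond_gene_given_pathway
cond_pathway_given_gene
```

Normalising the same joint table by columns or by rows is what distinguishes the two directions of conditioning.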

Number of pathways and genes

The function double.factorial can help to calculate how many possibilities there are:

double.factorial(nGenes(Info))
double.factorial(nPathways(Info))

Given the number of genes, it calculates the number of pathways that can be made with them; conversely, given the number of pathways, it calculates the number of genes that there could be.
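For reference, the usual mathematical double factorial is $n!! = n (n - 2) (n - 4) \cdots$, ending at 1 or 2. The base R sketch below implements that definition; double_factorial_sketch is a hypothetical helper written for this illustration, and it is only an assumption that double.factorial in GSEAdv follows the same definition.

```r
# Hypothetical helper, not the GSEAdv implementation:
# the usual double factorial n!! = n * (n - 2) * (n - 4) * ...
double_factorial_sketch <- function(n) {
  # Accept a single non-negative whole number.
  stopifnot(length(n) == 1, n >= 0, n == round(n))
  # By convention 0!! and 1!! are both 1.
  if (n <= 1) {
    return(1)
  }
  # Multiply every second integer from n downwards.
  prod(seq(n, 1, by = -2))
}

double_factorial_sketch(5) # 5 * 3 * 1
double_factorial_sketch(6) # 6 * 4 * 2
```

The value grows very quickly with n, so for realistic numbers of genes or pathways the result is best read as an order of magnitude.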

Other vignettes

In case you missed them, here are the previous vignettes and a new vignette:

To provide insight about the relationship between genes and gene sets.

Given some approximations of the relationships between genes and gene sets, it creates a new gene set collection.

Session info

sessionInfo()


llrs/GSEAdv documentation built on May 29, 2019, 6 p.m.