Introduction

DoRothEA is a comprehensive resource containing a curated collection of transcription factors (TFs) and their transcriptional targets. The set of genes regulated by a specific transcription factor is known as its regulon. DoRothEA's regulons were gathered from different types of evidence. Each TF-target interaction is assigned a confidence level based on the amount of supporting evidence. The confidence levels range from A (highest confidence) to E (lowest confidence) [@GarciaAlonso2019]. Although DoRothEA was originally developed for human data, it can also be applied to mouse data, with performance comparable to, and coverage better than, dedicated mouse regulons [@Holland2019].
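To make the regulon structure concrete, the following minimal sketch builds a toy table in a long format (one row per TF-target interaction) and tabulates the interactions per confidence level. The TF names, targets, and values are invented for illustration and are not taken from DoRothEA; the column names tf, confidence, target and mor (mode of regulation) are assumed here to mirror the layout of the dorothea regulon tables.

```r
# Toy regulon table; every row is made up for illustration.
toy_regulons <- data.frame(
  tf         = c("TF1", "TF1", "TF1", "TF2", "TF2", "TF3"),
  confidence = c("A", "A", "C", "B", "E", "D"),
  target     = c("G1", "G2", "G3", "G2", "G4", "G5"),
  mor        = c(1, -1, 1, 1, 1, -1),
  stringsAsFactors = FALSE
)

# Number of TF-target interactions per confidence level
interactions_per_level <- table(toy_regulons$confidence)

# Keep only the high-confidence interactions (levels A and B)
high_conf <- toy_regulons[toy_regulons$confidence %in% c("A", "B"), ]

# The regulon of each TF is the set of its remaining targets
regulon_sets <- split(high_conf$target, high_conf$tf)
```

Note that a TF whose interactions all fall below the chosen confidence cutoff (TF3 in this toy table) drops out of the analysis entirely, so a stricter cutoff trades coverage for reliability.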

DoRothEA regulons are usually coupled with the statistical method VIPER [@Alvarez2016]. In this context, the activity of a TF is computed from the mRNA expression levels of its targets. We can therefore consider TF activity as a proxy for a given transcriptional state [@Dugourd2019]. However, it is up to the user to decide which statistical method to use. Alternatives include, for instance, classical Gene Set Enrichment Analysis or a simple mean statistic.
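As an illustration of the simplest alternative mentioned above, the sketch below computes a signed mean of target expression per TF and sample. It is not VIPER and not part of the dorothea package: the helper mean_activity, the toy expression matrix, and the toy regulons are all hypothetical, and the mor column is assumed to be +1 for activation and -1 for repression.

```r
# Toy expression matrix (genes x samples); values are invented.
expr <- matrix(
  c(2, 4,
    1, 3,
    5, 1,
    0, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("G1", "G2", "G3", "G4"), c("S1", "S2"))
)

# Toy regulons; mor = +1 (activation) or -1 (repression).
regs <- data.frame(
  tf     = c("TF1", "TF1", "TF2", "TF2"),
  target = c("G1", "G3", "G2", "G4"),
  mor    = c(1, -1, 1, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: signed mean of target expression per TF and sample.
mean_activity <- function(expr, regs) {
  scores <- lapply(split(regs, regs$tf), function(r) {
    # Ignore targets that are not measured in the expression matrix
    r <- r[r$target %in% rownames(expr), , drop = FALSE]
    # Multiply each target row by its mode of regulation, then average
    colMeans(expr[r$target, , drop = FALSE] * r$mor)
  })
  do.call(rbind, scores)  # TFs in rows, samples in columns
}

tf_scores <- mean_activity(expr, regs)
```

In practice the expression values would typically be scaled per gene before averaging; that step is omitted here to keep the arithmetic easy to follow.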

Installation

First of all, you need a current version of R (http://www.r-project.org). In addition, you need dorothea, a freely available package deposited on Bioconductor (http://bioconductor.org/) and GitHub (https://github.com/saezlab/dorothea).

You can install it by running the following commands on an R console:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("dorothea")

We also load the packages required to run this vignette:

## We load the required packages
library(dorothea)
library(bcellViper)
library(dplyr)
library(viper)

Example of usage

Following the vignette of the viper package, we demonstrate how to combine viper with regulons from DoRothEA.

Accessing example data and DoRothEA regulons

Similar to the viper vignette, we use the gene expression matrix from the bcellViper package. See the viper vignette for more information about the gene expression matrix. The regulons from DoRothEA are provided within the dorothea package and can be accessed via the data() function. As the gene expression matrix contains human data, we load the human version of DoRothEA.

# accessing expression data from bcellViper
data(bcellViper, package = "bcellViper")

# accessing (human) dorothea regulons
# for mouse regulons: data(dorothea_mm, package = "dorothea")
data(dorothea_hs, package = "dorothea")

Running VIPER with DoRothEA regulons

We implemented a wrapper for the viper function that can deal with different input types such as a matrix, data frame, ExpressionSet or Seurat object (see the dedicated vignette for single-cell analysis). We subset DoRothEA to the confidence levels A and B to include only the high-quality regulons.

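# keep only high-confidence interactions (levels A and B)
regulons <- dorothea_hs %>%
  filter(confidence %in% c("A", "B"))

# dset is the expression data loaded above via data(bcellViper)
tf_activities <- run_viper(dset, regulons,
                           options = list(method = "scale", minsize = 4,
                                          eset.filter = FALSE, cores = 1,
                                          verbose = FALSE))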
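The viper function itself works with regulons stored as a named list, one entry per TF, where each entry holds a named tfmode vector and a likelihood vector. The sketch below shows how a long regulon table could be turned into that shape. It is only an illustration of the data layout: the helper table_to_viper_regulon is hypothetical, the toy table is invented, setting every likelihood to 1 is an assumption, and this is not necessarily how the wrapper handles the conversion internally.

```r
# Hypothetical helper, not part of dorothea or viper.
table_to_viper_regulon <- function(regs) {
  lapply(split(regs, regs$tf), function(r) {
    list(
      # mode of regulation, named by target gene
      tfmode     = stats::setNames(r$mor, r$target),
      # assumption: equal weight for every interaction
      likelihood = rep(1, nrow(r))
    )
  })
}

# Toy regulon table; all values are invented.
toy <- data.frame(
  tf     = c("TF1", "TF1", "TF2"),
  target = c("G1", "G2", "G3"),
  mor    = c(1, -1, 1),
  stringsAsFactors = FALSE
)

toy_regulon <- table_to_viper_regulon(toy)
```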

Session info

sessionInfo()

References



t-stei/dorothea documentation built on March 19, 2022, 7:26 a.m.