View source: R/02_addMotifAnnotation.R
addMotifAnnotation | R Documentation |
Select significant motifs and/or annotate motifs to genes or transcription factors. A motif is considered significantly enriched if it passes the Normalized Enrichment Score (NES) threshold.
addMotifAnnotation(
auc,
nesThreshold = 3,
digits = 3,
motifAnnot = NULL,
motifAnnot_highConfCat = c("directAnnotation", "inferredBy_Orthology"),
motifAnnot_lowConfCat = c("inferredBy_MotifSimilarity",
"inferredBy_MotifSimilarity_n_Orthology"),
idColumn = "motif",
highlightTFs = NULL,
keepAnnotationCategory = TRUE
)
auc |
Output from calcAUC. |
nesThreshold |
Numeric. NES threshold used to call a motif significant (3.0 by default). The NES is calculated for each motif from the AUC distribution of all the motifs for the gene-set [(x-mean)/sd]. |
digits |
Integer. Number of digits for the AUC and NES in the output table. |
motifAnnot |
Motif annotation database containing the annotations of the motif to transcription factors. The names should match the ranking column names. |
motifAnnot_highConfCat |
Categories considered as sources of 'high confidence' annotations. By default, "directAnnotation" (annotated in the source database) and "inferredBy_Orthology" (the motif is annotated to a homologous/orthologous gene). |
motifAnnot_lowConfCat |
Categories considered 'lower confidence' source for annotations. By default, the annotations inferred based on motif similarity ("inferredBy_MotifSimilarity", "inferredBy_MotifSimilarity_n_Orthology"). |
idColumn |
Annotation column containing the ID (e.g. motif, accession) |
highlightTFs |
Character. If a list of transcription factors is provided, the column TFinDB in the output table will indicate whether any of those TFs are included within the 'high-confidence' annotation (two asterisks, **) or 'low-confidence' annotation (one asterisk, *) of the motif. The vector can be named to indicate which TF to highlight for each gene-set. Otherwise, all TFs will be used for all gene-sets. |
keepAnnotationCategory |
Include annotation type in the TF information? |
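The NES formula given under nesThreshold, [(x-mean)/sd], can be illustrated with a few lines of base R. This is a minimal sketch rather than the package's internal code: the AUC values and motif names are made up, the threshold of 1.5 is chosen only so that the toy data yields a hit (the function default is 3), and the use of `>=` for "passing" the threshold is an assumption.

```r
# Hypothetical AUC values for five motifs in one gene set (made-up numbers).
auc <- c(motifA = 0.020, motifB = 0.022, motifC = 0.019,
         motifD = 0.021, motifE = 0.060)

# NES = (x - mean) / sd, computed over all motifs for the gene set.
nes <- (auc - mean(auc)) / sd(auc)

# Illustrative threshold only; addMotifAnnotation() defaults to nesThreshold = 3.
# Assumption: a motif "passes" when its NES is at or above the threshold.
nesThreshold <- 1.5
significant <- names(nes)[nes >= nesThreshold]

round(nes, digits = 3)
significant
```

With only five motifs the NES cannot get anywhere near 3, which is why the toy threshold is lowered. A real ranking database contains thousands of motifs, so the default is reachable there.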
A data.table with the following columns:
geneSet: Name of the gene set
motif: ID of the motif (colnames of the ranking, it might be other kind of feature)
NES: Normalized enrichment score of the motif in the gene-set
AUC: Area Under the Curve (used to calculate the NES)
TFinDB: Indicates whether the TFs given in highlightTFs are included within the high-confidence annotation (two asterisks, **) or lower-confidence annotation (one asterisk, *)
TF_highConf: Transcription factors annotated to the motif based on high-confidence annotations.
TF_lowConf: Transcription factors annotated to the motif based on lower-confidence annotations.
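As a rough illustration of how the columns above might be used, the following sketch builds a toy table with the documented column names and filters it on the TFinDB flag. All values are made up. The TF columns are simplified to bare gene symbols, and the empty string for "no highlighted TF" is an assumption rather than documented behaviour. A plain data.frame is used here so the sketch runs with base R only.

```r
# Toy table with the documented columns (all values are made up).
toyTable <- data.frame(
  geneSet     = c("hypoxia", "hypoxia", "hypoxia"),
  motif       = c("motif_1", "motif_2", "motif_3"),
  NES         = c(5.2, 4.1, 3.3),
  AUC         = c(0.091, 0.074, 0.062),
  TFinDB      = c("**", "*", ""),   # assumption: "" when no highlighted TF matches
  TF_highConf = c("HIF1A", "", "ARNT"),
  TF_lowConf  = c("", "HIF1A", ""),
  stringsAsFactors = FALSE
)

# Motifs whose high-confidence annotation includes a highlighted TF.
highConfHits <- toyTable[toyTable$TFinDB == "**", ]

# Motifs with a highlighted TF at either confidence level.
anyHits <- toyTable[toyTable$TFinDB != "", ]

highConfHits$motif
anyHits$motif
```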
Next step in the workflow: addSignificantGenes.
Previous step in the workflow: calcAUC.
See the package vignette for examples and more details:
vignette("RcisTarget")
##################################################
# Setup & previous steps in the workflow:
#### Gene sets
# As an example, the package includes a hypoxia gene set:
txtFile <- file.path(system.file('examples', package='RcisTarget'),
"hypoxiaGeneSet.txt")
geneLists <- list(hypoxia=read.table(txtFile, stringsAsFactors=FALSE)[,1])
#### Databases
## Motif rankings: Select according to organism and distance around TSS
## (See the vignette for URLs to download)
# motifRankings <- importRankings("hg19-500bp-upstream-7species.mc9nr.feather")
## For this example we will use a SUBSET of the ranking/motif databases:
library(RcisTarget.hg19.motifDBs.cisbpOnly.500bp)
data(hg19_500bpUpstream_motifRanking_cispbOnly)
motifRankings <- hg19_500bpUpstream_motifRanking_cispbOnly
## Motif - TF annotation:
data(motifAnnotations_hgnc_v9) # human TFs (for motif collection 9)
motifAnnotation <- motifAnnotations_hgnc_v9
### Run RcisTarget
# Step 1. Calculate AUC
motifs_AUC <- calcAUC(geneLists, motifRankings)
##################################################
### (This step: Step 2)
# Before starting: Set up the parallel computation
library(BiocParallel); register(MulticoreParam(workers = 2))
# Select significant motifs, add TF annotation & format as table
motifEnrichmentTable <- addMotifAnnotation(motifs_AUC,
motifAnnot=motifAnnotation)
# Alternative: Modifying some options
motifEnrichment_wIndirect <- addMotifAnnotation(motifs_AUC, nesThreshold=2,
motifAnnot=motifAnnotation,
highlightTFs = "HIF1A",
motifAnnot_highConfCat=c("directAnnotation"),
motifAnnot_lowConfCat=c("inferredBy_MotifSimilarity",
"inferredBy_MotifSimilarity_n_Orthology",
"inferredBy_Orthology"),
digits=3)
# Getting TFs for a given motif:
motifs <- motifEnrichmentTable$motif[1:3]
getMotifAnnotation(motifs, motifAnnot=motifAnnotation)
getMotifAnnotation(motifs, motifAnnot=motifAnnotation, returnFormat="list")
### Exploring the output:
# Number of enriched motifs (Over the given NES threshold)
nrow(motifEnrichmentTable)
# Interactive exploration
motifEnrichmentTable <- addLogo(motifEnrichmentTable)
DT::datatable(motifEnrichmentTable, filter="top", escape=FALSE,
options=list(pageLength=50))
# Note: If using the fake database, the results of this analysis are meaningless
# The object returned is a data.table (for faster computation),
# which has a different syntax from the standard data.frame or matrix
# Feel free to convert it to a data.frame (as.data.frame())
motifEnrichmentTable[,1:6]
##################################################
# Next step (step 3, optional):
## Not run:
motifEnrichmentTable_wGenes <- addSignificantGenes(motifEnrichmentTable,
geneSets=geneLists,
rankings=motifRankings,
method="aprox")
## End(Not run)