RcisTarget: cisTarget
In aertslab/RcisTarget: RcisTarget Identify transcription factor binding motifs enriched on a list of genes or genomic regions

cisTarget

R Documentation

cisTarget

Description

Identifies DNA motifs significantly over-represented in a gene-set.

This is the main function to run RcisTarget. It includes the following steps:

1. Motif enrichment analysis (calcAUC)
2. Motif-TF annotation (addMotifAnnotation)
3. Selection of significant genes (addSignificantGenes)
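
The three steps listed above can also be run one at a time. The following is a minimal sketch, not the package's own source: `runCisTargetSteps` is a hypothetical helper name, and the argument names passed to `calcAUC()`, `addMotifAnnotation()` and `addSignificantGenes()` are assumed from the package documentation, so check them against the installed version. Calling the helper requires RcisTarget and a rankings database.

```r
# Hypothetical helper (not part of RcisTarget) that chains the three steps.
# Argument names of the RcisTarget functions are assumptions; see their help pages.
runCisTargetSteps <- function(geneSets, motifRankings, motifAnnot = NULL,
                              nesThreshold = 3,
                              aucMaxRank = 0.05 * ncol(motifRankings),
                              geneErnMethod = "aprox",
                              geneErnMaxRank = 5000,
                              nCores = 1) {
  # 1. Motif enrichment analysis: AUC for each motif and gene set
  motifsAUC <- RcisTarget::calcAUC(geneSets, motifRankings,
                                   aucMaxRank = aucMaxRank, nCores = nCores)

  # 2. Motif-TF annotation: keep motifs over the NES threshold and annotate them
  enrichmentTable <- RcisTarget::addMotifAnnotation(motifsAUC,
                                                    nesThreshold = nesThreshold,
                                                    motifAnnot = motifAnnot)

  # 3. Selection of significant genes for each enriched motif
  RcisTarget::addSignificantGenes(enrichmentTable,
                                  geneSets = geneSets,
                                  rankings = motifRankings,
                                  method = geneErnMethod,
                                  maxRank = geneErnMaxRank,
                                  nCores = nCores)
}
```

Running the steps separately is mainly useful for inspecting the intermediate AUC values or for re-annotating motifs with a different NES threshold without recomputing the AUC.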

Usage

cisTarget(
  geneSets,
  motifRankings,
  motifAnnot = NULL,
  motifAnnot_highConfCat = c("directAnnotation", "inferredBy_Orthology"),
  motifAnnot_lowConfCat = c("inferredBy_MotifSimilarity",
    "inferredBy_MotifSimilarity_n_Orthology"),
  highlightTFs = NULL,
  nesThreshold = 3,
  aucMaxRank = 0.05 * ncol(motifRankings),
  geneErnMethod = "aprox",
  geneErnMaxRank = 5000,
  nCores = 1,
  verbose = TRUE
)

Arguments

`geneSets`	List of gene-sets to analyze. The gene-sets should be provided as `GeneSet`, `GeneSetCollection` or character list (see examples).
`motifRankings`	Database for the appropriate organism and search space (i.e. 10kbp around the TSS or 500bp upstream of it). These objects are provided in separate files, which can be imported with `importRankings()`:
- http://pyscenic.aertslab.org/databases/mm9-500bp-upstream-7species.mc9nr.feather (Mouse, 500bp)
- http://pyscenic.aertslab.org/databases/mm9-tss-centered-10kb-7species.mc9nr.feather (Mouse, 10kbp)
- http://pyscenic.aertslab.org/databases/hg19-500bp-upstream-7species.mc9nr.feather (Human, 500bp)
- http://pyscenic.aertslab.org/databases/hg19-tss-centered-10kb-7species.mc9nr.feather (Human, 10kbp)
- -Coming soon- (Fly)
See `vignette("RcisTarget")` for an exhaustive list of databases.
`motifAnnot`	Motif annotation database containing the annotations of the motif to transcription factors.
`motifAnnot_highConfCat`	Categories considered as source for 'high confidence' annotations. By default, "directAnnotation" (annotated in the source database), and "inferredBy_Orthology" (the motif is annotated to a homologous/orthologous gene).
`motifAnnot_lowConfCat`	Categories considered as source for 'lower confidence' annotations. By default, the annotations inferred based on motif similarity ("inferredBy_MotifSimilarity", "inferredBy_MotifSimilarity_n_Orthology").
`highlightTFs`	Character. If a list of transcription factors is provided, the column TFinDB in the output table will indicate whether any of those TFs are included within the 'high-confidence' annotation (two asterisks, `**`) or 'low-confidence' annotation (one asterisk, `*`) of the motif. The vector can be named to indicate which TF to highlight for each gene-set. Otherwise, all TFs will be used for all geneSets.
`nesThreshold`	Numeric. NES threshold used to select significant motifs (3.0 by default). The NES is calculated for each motif based on the AUC distribution of all the motifs for the gene-set [(x-mean)/sd]. Motifs are considered significantly enriched if they pass the Normalized Enrichment Score (NES) threshold.
`aucMaxRank`	Threshold to calculate the AUC. In a simplified way, the AUC value represents the fraction of genes, within the top X genes in the ranking, that are included in the signature. This parameter sets X, the number of genes at the top of the ranking used for this computation. By default it is set to 5% of the total number of genes in the rankings. Common values range from 1 to 10%.
`geneErnMethod`	"iCisTarget" or "aprox". Method to identify the highly ranked genes (see addSignificantGenes for details).
`geneErnMaxRank`	Maximum rank to take into account for the gene enrichment recovery curve (see addSignificantGenes for details).
`nCores`	Number of cores to use for computation. Note: In general, using a higher number of cores (i.e. processes) decreases overall running time. However, it also depends on the available memory and overall system load. Setting nCores too high might also decrease performance.
`verbose`	Should the function show progress messages? (TRUE / FALSE)
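
To make the `aucMaxRank` and `nesThreshold` descriptions concrete, here is a toy illustration in base R. It is a simplified sketch, not the RcisTarget implementation: the rankings matrix, the gene set and the `simpleAUC` helper are all made up for the example, and the AUC normalization is an assumption chosen for readability.

```r
# Toy illustration only: simulated rankings, hypothetical helper names.
set.seed(42)
nGenes  <- 1000
nMotifs <- 200

# One row per motif, one column per gene; rank 1 = best-scoring gene.
rankings <- t(replicate(nMotifs, sample(nGenes)))
dimnames(rankings) <- list(paste0("motif", seq_len(nMotifs)),
                           paste0("gene", seq_len(nGenes)))

# Made-up gene set, forced to the top of motif1's ranking so that
# at least one motif is clearly enriched.
geneSet <- paste0("gene", 1:30)
rankings["motif1", ] <- c(seq_along(geneSet), sample(31:nGenes))

# Same default as in the Usage section: 5% of the genes in the rankings.
aucMaxRank <- 0.05 * ncol(rankings)

# Simplified AUC: area under the recovery curve up to maxRank, divided by
# the area obtained if every gene of the set were recovered at every rank.
simpleAUC <- function(motifRanks, geneSet, maxRank) {
  setRanks <- motifRanks[geneSet]
  setRanks <- setRanks[setRanks <= maxRank]
  # A gene recovered at rank r adds to the curve from r up to maxRank
  sum(maxRank - setRanks + 1) / (maxRank * length(geneSet))
}

auc <- vapply(rownames(rankings),
              function(m) simpleAUC(rankings[m, ], geneSet, aucMaxRank),
              numeric(1))

# NES: (x - mean) / sd over the AUC of all motifs for this gene set
nes <- (auc - mean(auc)) / sd(auc)

# Motifs passing the default threshold of 3
enriched <- names(nes)[nes >= 3]
```

Lowering `aucMaxRank` restricts the computation to fewer top-ranked genes, while raising the NES threshold keeps fewer motifs.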

Value

data.table containing the over-represented motifs (according to the selected NES threshold), their statistics, annotation to transcription factors and the genes with high enrichment of the motif.

Examples



# Example for running RcisTarget using cisTarget() function (workflow wrapper)

## Not run: 

##################################################
### Load your gene sets
# As an example, the package includes a hypoxia gene set:
txtFile <- file.path(system.file('examples', package='RcisTarget'),
                     "hypoxiaGeneSet.txt")
geneLists <- list(hypoxia=read.table(txtFile, stringsAsFactors=FALSE)[,1])

### Load databases
# Motif rankings: Select according to organism and distance around TSS
# (See the vignette for URLs to download)
motifRankings <- importRankings("hg19-500bp-upstream-7species.mc9nr.feather")

## Motif - TF annotation:
data(motifAnnotations_hgnc_v9) # human TFs (for motif collection 9)
motifAnnotation <- motifAnnotations_hgnc_v9
##################################################

# Run (R)cisTarget
motifEnrichmentTable_wGenes <- cisTarget(geneLists, motifRankings,
  motifAnnot=motifAnnotation,
  nesThreshold=3.5, geneErnMethod="aprox", nCores=2)


## End(Not run)

# Load results from analysis
load(file.path(system.file('examples', package='RcisTarget'),
               "motifEnrichmentTable_wGenes.RData"))


### Exploring the output:
# Note: If using the fake-database, the results are not meaningful

# Number of enriched motifs (Over the given NES threshold)
nrow(motifEnrichmentTable_wGenes)

# Available info (columns)
colnames(motifEnrichmentTable_wGenes)

# The object returned is a data.table (for faster computation),
# which has a different syntax from the standard data.frame or matrix
# Feel free to convert it to a data.frame (as.data.frame())
class(motifEnrichmentTable_wGenes)
motifEnrichmentTable_wGenes[,1:5]

# Enriched genes
enrGenes <- as.character(motifEnrichmentTable_wGenes[1,"enrichedGenes"])
strsplit(enrGenes, ";")


# Interactive exploration
motifEnrichmentTable_wGenes <- addLogo(motifEnrichmentTable_wGenes)
DT::datatable(motifEnrichmentTable_wGenes[,1:9], escape = FALSE, filter="top",
              options=list(pageLength=5))
# Note: If using the fake database, the results of this analysis are meaningless

aertslab/RcisTarget documentation built on March 7, 2024, 11:21 p.m.
