encodemotif: MotifDb object containing motif information from the known...

encodemotif R Documentation

MotifDb object containing motif information from the known and discovered motifs for the ENCODE TF ChIP-seq datasets.

Description

From the abstract: "Recent advances in technology have led to a dramatic increase in the number of available transcription factor ChIP-seq and ChIP-chip data sets. Understanding the motif content of these data sets is an important step in understanding the underlying mechanisms of regulation. Here we provide a systematic motif analysis for 427 human ChIP-seq data sets using motifs curated from the literature and also discovered de novo using five established motif discovery tools. We use a systematic pipeline for calculating motif enrichment in each data set, providing a principled way for choosing between motif variants found in the literature and for flagging potentially problematic data sets. Our analysis confirms the known specificity of 41 of the 56 analyzed factor groups and reveals motifs of potential cofactors. We also use cell type-specific binding to find factors active in specific conditions. The resource we provide is accessible both for browsing a small number of factors and for performing large-scale systematic analyses. We provide motif matrices, instances and enrichments in each of the ENCODE data sets. The motifs discovered here have been used in parallel studies to validate the specificity of antibodies, understand cooperativity between data sets and measure the variation of motif binding across individuals and species."

Usage

encodemotif

Format

MotifDb object of length 2064; to access metadata use mcols(encodemotif)

providerName

Name provided by ENCODE

providerId

Same as providerName

dataSource

"ENCODE-motif"

geneSymbol

Gene symbol for the transcription factor

geneId

Entrez gene ID for the transcription factor

geneIdType

"ENTREZ"

proteinId

UniProt ID for the transcription factor

proteinIdType

"UNIPROT"

organism

"Hsapiens"

sequenceCount

NA (not available)

bindingSequence

Consensus sequence for the motif

bindingDomain

NA (incomplete)

tfFamily

NA (incomplete)

experimentType

Occurs in two forms:

For motifs discovered in this study, the format is cellType_source-LabMetadata:MotifFinder#Location, for example H1-hESC_encode-Myers_seq_hsa_v041610.2_r1:MEME#2#Intergenic.

For "known" motifs, the format tends to be TF_source_sourceId, for example AP1_jaspar_MA0099.2.

pubmedID

"24335146"; see Source for more details.
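
The two experimentType forms can be told apart with base R string functions. The following is a minimal sketch, not part of the package: it assumes that only discovered motifs contain a ":" before the motif finder name, and that a known motif splits cleanly at its first two underscores. Because the known format only "tends to be" TF_source_sourceId, some entries may not match. The variable names (experiment_type, is_discovered, motif_finder, known) are illustrative.

```r
# Two example strings taken from the experimentType description.
experiment_type <- c(
  "H1-hESC_encode-Myers_seq_hsa_v041610.2_r1:MEME#2#Intergenic",
  "AP1_jaspar_MA0099.2"
)

# Assumption: only discovered motifs contain a ":" before the motif finder.
is_discovered <- grepl(":", experiment_type, fixed = TRUE)

# Discovered motifs: the text between ":" and the first "#" names the finder.
motif_finder <- ifelse(
  is_discovered,
  sub("^[^:]*:([^#]*).*$", "\\1", experiment_type),
  NA_character_
)

# Known motifs: split TF_source_sourceId at the first two underscores.
known_strings <- experiment_type[!is_discovered]
known_parts <- regmatches(
  known_strings,
  regexec("^([^_]+)_([^_]+)_(.+)$", known_strings)
)
known <- data.frame(
  tf        = vapply(known_parts, `[`, character(1), 2),
  source    = vapply(known_parts, `[`, character(1), 3),
  source_id = vapply(known_parts, `[`, character(1), 4),
  stringsAsFactors = FALSE
)

print(motif_finder)
print(known)
```

Strings that do not match the known-motif pattern yield NA in all three columns of known, which makes exceptions easy to spot.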

Details

Load with data(encodemotif)
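
A short sketch of loading the object and inspecting the metadata columns listed under Format. It assumes the motifbreakR package is installed and that attaching it also makes mcols() available (through MotifDb and S4Vectors). The guard skips the example when the package is missing, and the choice of columns is only illustrative.

```r
# Run only when motifbreakR is installed (assumption: it ships this dataset).
if (requireNamespace("motifbreakR", quietly = TRUE)) {
  library(motifbreakR)
  data(encodemotif)

  # Number of motifs in the collection.
  print(length(encodemotif))

  # Metadata is stored one row per motif.
  meta <- mcols(encodemotif)
  print(colnames(meta))

  # Peek at a few of the documented columns.
  print(head(meta[, c("providerName", "geneSymbol", "dataSource")]))
}
```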

Value

MotifList-class object

Source

Pouya Kheradpour and Manolis Kellis (2013 December 13) Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments. Nucleic Acids Research, doi:10.1093/nar/gkt1249

See Also

http://compbio.mit.edu/encode-motifs/

Examples

data(encodemotif)
encodemotif
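
As a further hedged example, the motifs for one transcription factor can be selected through the geneSymbol metadata column. This sketch assumes motifbreakR is installed and that the object supports integer subsetting with `[`, as list-like Bioconductor objects generally do. The symbol "CTCF" is only an illustrative choice and may match no entries.

```r
if (requireNamespace("motifbreakR", quietly = TRUE)) {
  library(motifbreakR)
  data(encodemotif)

  # which() drops NA comparisons, so missing gene symbols are skipped.
  idx <- which(mcols(encodemotif)$geneSymbol == "CTCF")

  # Subset of the collection for the chosen symbol (possibly empty).
  ctcf_motifs <- encodemotif[idx]
  print(length(ctcf_motifs))
}
```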

Simon-Coetzee/motifBreakR documentation built on Aug. 6, 2024, 5:17 a.m.