Calculate motif enrichment using one of the available scoring algorithms and background corrections.
motifEnrichment(
  sequences,
  pwms,
  score = "autodetect",
  bg = "autodetect",
  cutoff = NULL,
  verbose = TRUE,
  motif.shuffles = 30,
  B = 1000,
  group.only = FALSE
)
sequences | the sequences to be scanned for enrichment. Can be a single sequence (an object of class DNAString), a list of DNAString objects, or a DNAStringSet object.
pwms | this parameter can take multiple values depending on the scoring scheme and background correction used. When the
score | this parameter determines which scoring scheme to use. The following schemes are available:
bg | this parameter determines how the raw score is compared to the background distribution.
cutoff | the score cutoff for a significant motif hit if scoring scheme "cutoff" is selected.
verbose | whether to print verbose output.
motif.shuffles | number of times to shuffle motifs if using the "ms" background correction.
B | number of replicates when calculating the empirical P-value.
group.only | whether to return statistics only for the group of sequences, not for individual sequences. With an empirical background, the P-values for individual sequences are not calculated (thus saving time); for other backgrounds they are calculated but not returned.
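The B argument sets how many replicates go into an empirical P-value. As a rough, package-independent illustration of what such a P-value is, the sketch below uses base R only. It is not PWMEnrich's internal code: the observed score, the simulated null scores and the variable names are all hypothetical, and the +1 correction is one common convention rather than something this page specifies.

```r
# Illustrative only: a generic empirical P-value, not PWMEnrich's implementation.
set.seed(1)                  # make the simulated replicates reproducible
B <- 1000                    # number of replicates (same role as the B argument)
observed <- 2.5              # hypothetical observed score for one motif
null.scores <- rnorm(B)      # hypothetical scores from B background replicates

# Fraction of replicates scoring at least as high as the observed score.
# The +1 terms (an assumed convention) keep the estimate from being exactly zero.
p.empirical <- (sum(null.scores >= observed) + 1) / (B + 1)
p.empirical
```

With this convention the smallest attainable P-value is 1 / (B + 1), so a larger B gives finer resolution at the cost of more computation.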
This function provides an interface to all algorithms available in PWMEnrich for finding motif enrichment in a single sequence or a group of sequences, with or without background correction.
Since for all algorithms the first step involves calculating raw scores without background correction, the output always contains the scores without background correction together with (optional) background-corrected scores.
Unless otherwise specified the scores are returned both separately for each sequence (without/with background) and for the whole group of sequences (without/with background).
To use a background correction you need to supply a set of PWMs with precompiled background distribution parameters (see function makeBackground). When such an object is supplied as the pwms parameter, the scoring scheme and background correction are automatically determined.
There are additional packages with pre-computed backgrounds (e.g. see package PWMEnrich.Dmelanogaster.background).
Please refer to (Stojnic & Adryan, 2012) for more details on the algorithms.
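As a minimal sketch of the behaviour described above, the call below passes a precompiled background object as pwms, leaves score and bg at "autodetect", and sets group.only = TRUE to request group-level statistics only. It reuses the data set and sequences from this page's examples and assumes that PWMEnrich, Biostrings and PWMEnrich.Dmelanogaster.background are installed; the variable name res.group is only illustrative.

```r
# Runs only if the required packages are installed.
if (requireNamespace("PWMEnrich", quietly = TRUE) &&
    requireNamespace("PWMEnrich.Dmelanogaster.background", quietly = TRUE)) {
  library(Biostrings)   # provides DNAString
  library(PWMEnrich)    # provides motifEnrichment and motifRankingForGroup

  # precompiled lognormal background (same object as in the examples on this page)
  data(PWMLogn.dm3.MotifDb.Dmel, package = "PWMEnrich.Dmelanogaster.background")

  sequences <- list(DNAString("GAAGTATCAAGTGACCAGTAGATTGAAGTAGACCAGTC"),
                    DNAString("AGGTAGATAGAACAGTAGGCAATGGGGGAAATTGAGAGTC"))

  # score and bg stay at "autodetect"; only group statistics are returned
  res.group <- motifEnrichment(sequences, PWMLogn.dm3.MotifDb.Dmel,
                               group.only = TRUE)

  # ranking of motifs for the whole group of sequences
  head(motifRankingForGroup(res.group))
}
```

Per-sequence rankings are not expected to be available from res.group, since group.only = TRUE drops the individual-sequence statistics from the result.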
a MotifEnrichmentResults object containing a subset of the following elements:
"score" - scoring scheme used
"bg" - background correction used
"params" - any additional parameters
"sequences" - the set of sequences used
"pwms" - the set of pwms used
"sequence.nobg" - per-sequence scores without any background correction. For "affinity" and "clover" a matrix of mean affinity scores; for "cutoff" number of significant hits above a cutoff
"sequence.bg" - per-sequence scores after background correction. For "logn" and "pval" the P-value (smaller is better); for "z" and "ms" background corrections the z-scores (bigger is better).
"group.nobg" - aggregate scores for the whole group of sequences without background correction. For "affinity" and "clover" the mean affinity over all sequences in the set; for "cutoff" the total number of hits in all sequences.
"group.bg" - aggregate scores for the whole group of sequences with background correction. For "logn" and "pval", the P-value for the whole group (smaller is better); for "z" and "ms" the z-score for the whole set (bigger is better).
"sequence.norm" - (only for "logn") the length-normalized scores for each of the sequences. Currently only implemented for "logn", where it returns the values normalized from LogN(0,1) distribution
"group.norm" - (only for "logn") similar to sequence.norm, but for the whole group of sequences
R. Stojnic & B. Adryan: Identification of functional DNA motifs using a binding affinity lognormal background distribution, submitted.
MC Frith et al: Detection of functional DNA motifs via statistical over-representation, Nucleic Acids Research (2004).
if(requireNamespace("PWMEnrich.Dmelanogaster.background")){
    # attach the packages that provide DNAString and motifEnrichment
    library(Biostrings)
    library(PWMEnrich)

    ###
    # load the pre-compiled lognormal background
    data(PWMLogn.dm3.MotifDb.Dmel, package = "PWMEnrich.Dmelanogaster.background")

    # scan two sequences for motif enrichment
    sequences = list(DNAString("GAAGTATCAAGTGACCAGTAGATTGAAGTAGACCAGTC"),
      DNAString("AGGTAGATAGAACAGTAGGCAATGGGGGAAATTGAGAGTC"))
    res = motifEnrichment(sequences, PWMLogn.dm3.MotifDb.Dmel)

    # most enriched in both sequences (lognormal background P-value)
    head(motifRankingForGroup(res))

    # most enriched in both sequences (raw affinity, no background)
    head(motifRankingForGroup(res, bg=FALSE))

    # most enriched in the first sequence (lognormal background P-value)
    head(motifRankingForSequence(res, 1))

    # most enriched in the first sequence (raw affinity, no background)
    head(motifRankingForSequence(res, 1, bg=FALSE))

    ###
    # load the pre-compiled background for hit-based motif counts with a cutoff of P-value = 0.001
    data(PWMPvalueCutoff1e3.dm3.MotifDb.Dmel, package = "PWMEnrich.Dmelanogaster.background")
    res.count = motifEnrichment(sequences, PWMPvalueCutoff1e3.dm3.MotifDb.Dmel)

    # enrichment in the whole group, z-score for the number of motif hits
    # (uses res.count, the hit-count result, rather than the lognormal result res)
    head(motifRankingForGroup(res.count))

    # first sequence, sorted by number of motif hits with P-value < 0.001
    head(motifRankingForSequence(res.count, 1, bg=FALSE))
}