motif_peaks: Look for overrepresented motif position peaks in a set of...
In universalmotif: Import, Modify, and Export Motifs with R


Using the motif position data from scan_sequences() (or elsewhere), test whether certain positions in the sequences have significantly higher motif density.

motif_peaks(hits, seq.length, seq.count, bandwidth, max.p = 1e-06,
  peak.width = 3, nrand = 100, plot = TRUE, BP = FALSE)

`hits`	`numeric` A vector of sequence positions indicating motif sites.
`seq.length`	`numeric(1)` Length of sequences. Only one number is allowed, as all sequences must be of identical length. If missing, then the largest number from `hits` is used.
`seq.count`	`numeric(1)` Number of sequences with motif sites. If missing, then the number of unique values in `hits` is used.
`bandwidth`	`numeric(1)` Peak smoothing parameter. Smaller values produce narrower peaks; larger values produce wider peaks. If missing, `motif_peaks()` estimates a bandwidth from `hits` (see 'details').
`max.p`	`numeric(1)` Maximum P-value allowed for finding significant motif site peaks.
`peak.width`	`numeric(1)` Minimum peak width. A peak is defined as the highest point within the window set by `peak.width`.
`nrand`	`numeric(1)` Number of random permutations for generating a null distribution. To calculate P-values, a set of random motif site positions is generated `nrand` times.
`plot`	`logical(1)` If `TRUE`, create a `ggplot2` object displaying motif peaks.
`BP`	`logical(1)` Allows for the use of BiocParallel within `motif_peaks()`. See `BiocParallel::register()` to change the default backend. Setting `BP = TRUE` is only recommended for exceptionally large jobs. Keep in mind that this function will not attempt to limit its memory usage.

Kernel smoothing is used to calculate motif position density. The implementation for this process is based on code from the KernSmooth R package \insertCite{kern}{universalmotif}. These density estimates are used to determine peak locations and heights. To calculate the P-values of these peaks, a null distribution is calculated from peak heights of randomly generated motif positions.
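The idea described above can be illustrated with a minimal base R sketch. This is not the code used by motif_peaks(): it swaps in stats::density() for the KernSmooth-derived smoother, uses a fixed bandwidth, and computes a simple empirical P-value, which may differ from how the package derives its P-values. The toy data and the helper name max_density() are assumptions made for illustration only.

```r
# Illustrative sketch only; NOT the universalmotif implementation.
set.seed(1)

seq.length <- 1000  # assumed sequence length
n.hits     <- 200   # assumed number of motif sites
nrand      <- 100   # number of random permutations
bw         <- 20    # fixed bandwidth, chosen arbitrarily for this sketch

# Toy hit positions: a uniform background plus an enrichment near position 700.
background <- sample(seq.length, n.hits - 50, replace = TRUE)
enriched   <- round(rnorm(50, mean = 700, sd = 10))
enriched   <- pmin(pmax(enriched, 1), seq.length)  # clamp to valid positions
hits       <- c(background, enriched)

# Hypothetical helper: height of the tallest point of the smoothed density.
max_density <- function(pos, bw, seq.length) {
  d <- stats::density(pos, bw = bw, from = 1, to = seq.length, n = 512)
  max(d$y)
}

# Observed peak height.
observed <- max_density(hits, bw, seq.length)

# Null distribution: peak heights from uniformly random positions.
null_heights <- replicate(
  nrand,
  max_density(sample(seq.length, n.hits, replace = TRUE), bw, seq.length)
)

# Empirical P-value, with a +1 correction so it is never exactly zero.
p_value <- (sum(null_heights >= observed) + 1) / (nrand + 1)
p_value
```

With an empirical P-value of this kind, the smallest attainable value is 1 / (nrand + 1), so nrand bounds the resolution of the test in this sketch.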

If the bandwidth option is not supplied, then the following code is used (from KernSmooth):

del0 <- (1 / (4 * pi))^(1 / 10)

bandwidth <- del0 * (243 / (35 * length(hits)))^(1 / 5) * sqrt(var(hits))
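As a worked example, the default bandwidth expression above can be wrapped in a small function and applied to a toy vector of positions. The function name default_bandwidth() and the simulated hits are assumptions for illustration; only the formula itself comes from the text above.

```r
# Hypothetical wrapper around the default bandwidth expression shown above.
default_bandwidth <- function(hits) {
  del0 <- (1 / (4 * pi))^(1 / 10)
  # sqrt(var(hits)) is the sample standard deviation, i.e. sd(hits).
  del0 * (243 / (35 * length(hits)))^(1 / 5) * sqrt(var(hits))
}

set.seed(42)
hits <- sample(1000, 150, replace = TRUE)  # toy motif site positions

bw <- default_bandwidth(hits)
bw

# The bandwidth scales linearly with the spread of the positions:
# doubling every position doubles the bandwidth.
default_bandwidth(hits * 2) / bw
```

Because the expression contains length(hits)^(-1/5), the default bandwidth shrinks slowly as more motif sites are supplied, giving narrower peaks for larger data sets with the same spread.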

A DataFrame with peak positions and P-values. If plot = TRUE, then a list is returned with the DataFrame as the first item and the ggplot2 object as the second item.

Benjamin Jean-Marie Tremblay, b2tremblay@uwaterloo.ca

\insertRef{kern}{universalmotif}

scan_sequences()

library(universalmotif)

data(ArabidopsisMotif)
data(ArabidopsisPromoters)
if (R.Version()$arch != "i386") {
  # Scan the forward strand only (RC = FALSE)
  hits <- scan_sequences(ArabidopsisMotif, ArabidopsisPromoters, RC = FALSE)
  # Positional arguments: seq.length = 1000, seq.count = 50
  res <- motif_peaks(as.vector(hits$start), 1000, 50)
  # View plot:
  res$Plot

  # The raw plot data can be found in:
  res$Plot$data
}