plotMotifDensityMap: Plotting density maps of motif occurrence

Description

Plots the density of motif occurrences in an ordered set of sequences of the same length, in the form of a two-dimensional map centered at a common reference position. The motif is specified by a position weight matrix (PWM) that contains the estimated probability of base b at position i. Only motif hits scoring above the specified threshold are counted and plotted.

Usage

plotMotifDensityMap(regionsSeq, motifPWM, minScore = "80%",
    seqOrder = c(1:length(regionsSeq)), flankUp = NULL, flankDown = NULL,
    nBin = NULL, bandWidth = NULL, color = "blue", transf = NULL, xTicks = NULL,
    xTicksAt = NULL, xLabel = "", yTicks = NULL, yTicksAt = NULL, yLabel = "",
    cexAxis = 8, plotScale = TRUE, scaleLength = NULL, scaleWidth = 15,
    addReferenceLine = TRUE, plotColorLegend = TRUE, outFile = "DensityMap",
    plotWidth = 2000, plotHeight = 2000)

Arguments

regionsSeq

A DNAStringSet object. Set of sequences of the same length for which the motif occurrence density should be visualised.

motifPWM

A numeric matrix representing the Position Weight Matrix (PWM), such as that returned by the PWM function. Can contain either probabilities or log2 probability ratios of base b at position i.

minScore

The minimum score for counting a motif hit. Can be given as a character string containing a percentage (e.g. "85%") of the PWM score or as a single number specifying the score threshold. If a percentage is given, it is converted to a score value taking into account both the minimal and maximal possible PWM scores as follows:

    minPWMscore + percThreshold/100 * (maxPWMscore - minPWMscore)

This differs from the formula in the matchPWM function from the Biostrings package, which takes into account only the maximal possible PWM score and considers the given percentage as the percentage of that maximal score:

    percThreshold/100 * maxPWMscore
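The two percentage conversions described for minScore can be compared numerically. The following is a minimal base R sketch, not the package's own code: the PWM values are made up for illustration, and the minimal and maximal possible scores are assumed to be the sums of the per-position minima and maxima, as is the case for additive PWM scoring.

```r
# Toy PWM of log2 probability ratios: 4 bases (rows) x 3 positions (columns).
# The values are invented for illustration only.
pwm <- matrix(c( 1.2, -0.5, -1.0, -0.7,
                -0.9,  1.5, -1.1, -0.4,
                -0.3, -0.8,  0.9, -0.6),
              nrow = 4,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Assumption: the extreme scores are the sums of the column minima and maxima.
minPWMscore <- sum(apply(pwm, 2, min))
maxPWMscore <- sum(apply(pwm, 2, max))

percThreshold <- 80

# Conversion described for minScore: rescale between the minimal and maximal score.
scoreRescaled <- minPWMscore + percThreshold / 100 * (maxPWMscore - minPWMscore)

# Conversion attributed to matchPWM: a plain fraction of the maximal score.
scoreOfMax <- percThreshold / 100 * maxPWMscore

print(c(rescaled = scoreRescaled, fractionOfMax = scoreOfMax))
```

For a PWM whose minimal possible score is negative, as here, the same percentage yields a lower absolute threshold under the rescaled formula than under the fraction-of-maximum formula.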

seqOrder

Integer vector specifying the order of the provided input sequences. Must have the same length as the number of sequences in regionsSeq. Input sequences will be sorted according to this index in ascending order from the top to the bottom of the plot, i.e. the sequence labeled with the lowest number will appear at the top of the plot. The default value keeps the sequences in the order in which they appear in the input regionsSeq object.

flankUp, flankDown

The number of base-pairs upstream and downstream of the reference position in the provided sequences, respectively. flankUp + flankDown must sum up to the length of the sequences. If no values are provided both flankUp and flankDown are set to be half of the length of the input sequences, i.e. the reference position is assumed to be in the middle of the sequences.
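The constraint on flankUp and flankDown can be sketched as a small check. The helper resolveFlanks below is hypothetical and not part of seqPattern. It assumes an even sequence length for the default case and assumes that both flanks are supplied together, neither of which is specified above.

```r
# Hypothetical helper (not part of seqPattern): resolve flank sizes for
# sequences of a given length, following the rule described above.
resolveFlanks <- function(seqLength, flankUp = NULL, flankDown = NULL) {
    if (is.null(flankUp) && is.null(flankDown)) {
        # Default: the reference position is in the middle of the sequence.
        flankUp <- flankDown <- seqLength / 2
    }
    # Assumption: flanks are either both given or both omitted.
    stopifnot(!is.null(flankUp), !is.null(flankDown),
              flankUp + flankDown == seqLength)
    c(flankUp = flankUp, flankDown = flankDown)
}

defaultFlanks <- resolveFlanks(1000)
customFlanks  <- resolveFlanks(1000, flankUp = 400, flankDown = 600)
print(defaultFlanks)
print(customFlanks)
```

A call such as resolveFlanks(1000, flankUp = 300, flankDown = 600) stops with an error, because the flanks do not sum to the sequence length.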

nBin

Numeric vector with two values containing the number of equally spaced points in each direction over which the density is to be estimated. The first value specifies the number of bins along the x-axis, i.e. along the nucleotides in the sequence, and the second value specifies the number of bins along the y-axis, i.e. across the ordered input sequences. The values are passed on to the gridsize argument of the bkde2D function to compute a 2D binned kernel density estimate. If nBin is not specified it will default to c(n, m), where n is the number of input sequences and m is the length of sequences.

bandWidth

Numeric vector of length 2, containing the bandwidth to be used in each coordinate direction. The first value specifies the bandwidth along the x-axis, i.e. along the nucleotides in the sequence, and the second value specifies the bandwidth along the y-axis, i.e. across the ordered input sequences. The values are passed on to the bandwidth argument of the bkde2D function to compute a 2D binned kernel density estimate and are used as the standard deviations of the bivariate Gaussian kernel. If bandWidth is not specified it will default to c(3,3).
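To illustrate how the nBin and bandWidth values map onto the gridsize and bandwidth arguments of bkde2D (from the KernSmooth package), here is a minimal sketch on simulated hit coordinates. The data, the sequence dimensions, and the grid size are all made up, and the sketch does not claim to reproduce how plotMotifDensityMap prepares its input internally.

```r
library(KernSmooth)  # provides bkde2D

set.seed(1)
seqLen <- 400  # hypothetical sequence length (x-axis)
nSeq   <- 100  # hypothetical number of sequences (y-axis)

# Simulated motif hits: column 1 = position within the sequence,
# column 2 = index of the sequence in the ordered set.
hits <- cbind(position = as.numeric(sample(seqLen, 500, replace = TRUE)),
              sequence = as.numeric(sample(nSeq, 500, replace = TRUE)))

# bandwidth plays the role of bandWidth, gridsize the role of nBin;
# in both, the first value refers to the x-axis and the second to the y-axis.
dens <- bkde2D(hits,
               bandwidth = c(3, 3),
               gridsize  = c(200L, 100L),
               range.x   = list(c(1, seqLen), c(1, nSeq)))

# fhat is a matrix of density estimates with one row per x grid point
# and one column per y grid point.
print(dim(dens$fhat))
```

The returned list also holds the grid coordinates in x1 and x2, so the estimate can be passed directly to a plotting function such as image(dens$x1, dens$x2, dens$fhat).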

color

Character specifying the color palette for the density plot. One of the following color palettes can be specified: "blue", "brown", "cyan", "gold", "gray", "green", "pink", "purple", "red". Please refer to the vignette for the appearance of these palettes.

transf

The function mapping the density scale to the color scale. See Details.

xTicks

Character vector of labels to be placed at the tick-marks on x-axis. The default NULL value produces five tick-marks: one at the reference point and two equally spaced tick-marks both upstream and downstream of the reference point.

xTicksAt

Numeric vector of positions of the tick-marks on the x-axis. The values can range from 1 (the position of the first base-pair in the sequence) to input sequence length. The default NULL value produces five tick-marks: one at the reference point and two equally spaced tick-marks both upstream and downstream of the reference point.

xLabel

The label for the x-axis. The default is no label, i.e. empty string.

yTicks

Character vector of labels to be placed at the tick-marks on y-axis. The default NULL value produces no tick-marks and labels.

yTicksAt

Numeric vector of positions of the tick-marks on the y-axis. The values can range from 1 (the position of the last sequence at the bottom of the plot) to the number of input sequences (the position of the first sequence at the top of the plot). The default NULL value produces no tick-marks.

yLabel

The label for the y-axis. The default is no label, i.e. empty string.

cexAxis

The magnification to be used for axis annotation.

plotScale

Logical, should the scale bar be plotted in the lower left corner of the plot.

scaleLength

The length of the scale bar to be plotted. Used only when plotScale = TRUE. If no value is provided, it defaults to one fifth of the input sequence length.

scaleWidth

The width of the line for the scale bar. Used only when plotScale = TRUE.

addReferenceLine

Logical, should the vertical dashed line be drawn at the reference point.

plotColorLegend

Logical, should the color legend for the pattern density be plotted. If TRUE a separate .png file named outFile."ColorLegend.png" will be created, showing mapping of pattern density values to colours.

outFile

Character vector specifying the base name of the output plot file. The final name of the plot file for each pattern will be outFile."pattern.jpg".

plotWidth, plotHeight

Width and height of the density plot in pixels.

Value

The function produces a PNG file in the working directory, visualising the density of motif occurrences above the specified threshold in the set of ordered input sequences.

Author(s)

Vanja Haberle

References

Haberle et al. (2014) Two independent transcription initiation codes overlap on vertebrate core promoters, Nature 507:381-385.

See Also

motifScanHits
plotPatternDensityMap

Examples

library(GenomicRanges)
library(seqPattern)  # provides plotMotifDensityMap and the example data
load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))
promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth

load(system.file("data", "TBPpwm.RData", package="seqPattern"))

plotMotifDensityMap(regionsSeq = zebrafishPromoters, motifPWM = TBPpwm,
                    minScore = "85%", seqOrder = order(promoterWidth),
                    flankUp = 400, flankDown = 600, color = "red")

seqPattern documentation built on Nov. 8, 2020, 7:52 p.m.