motifScanScores: Motif scanning scores for a set of ordered sequences

Description

Provides motif scanning scores along the full length of each sequence in a set of equal-length sequences, ordered by a provided index. The motif is specified by a position weight matrix (PWM) that contains the estimated probability of base b at position i and is usually constructed via a call to the PWM function. Scanning scores are returned as a two-dimensional matrix in which the rows are sequences ordered by the specified index and the columns are relative positions within the sequence. Each cell in the matrix contains the score of the specified motif in the given sequence starting at the given position. The resulting matrix can be used to visualise motif occurrences and their strength in an ordered set of sequences centered at a common reference point.

Usage

motifScanScores(regionsSeq, motifPWM, seqOrder = c(1:length(regionsSeq)),
        asPercentage = TRUE)

Arguments

regionsSeq

A DNAStringSet object. Set of sequences of the same length to be scanned with the motif.

motifPWM

A numeric matrix representing the Position Weight Matrix (PWM), such as the one returned by the PWM function. Can contain either probabilities or log2 probability ratios of base b at position i.

seqOrder

Integer vector specifying the order of the provided input sequences. Must have the same length as the number of sequences in regionsSeq. The default value keeps the sequences in the order in which they appear in the input regionsSeq object.

asPercentage

Logical. If TRUE, scores are expressed as a percentage of the maximal motif PWM score; if FALSE, raw scores are returned.
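
As an illustration of the kind of index seqOrder expects, the following base R sketch builds one with order(), as the Examples section does with promoter widths. The widths vector and its names are hypothetical and are not part of the package.

```r
# Hypothetical per-sequence feature used for ordering (not package data)
widths <- c(s1 = 40, s2 = 12, s3 = 25)

# order() returns an integer permutation with one entry per sequence,
# listing the sequences from the smallest to the largest width
seqOrder <- order(widths)
seqOrder                  # 2 3 1
names(widths)[seqOrder]   # "s2" "s3" "s1"
```

The resulting vector has the same length as the input, which is the requirement stated for seqOrder above.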

Details

This function uses the PWMscoreStartingAt function to get scores for a given motif starting at each position (nucleotide) in a set of input sequences. Input sequences must all be of the same length and are ordered according to the index provided in the seqOrder argument, creating an n * m matrix, where n is the number of sequences and m is the length of the sequences. Each cell in the matrix contains the score of the specified motif in the given sequence starting at the given position.
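
The scanning idea described above can be sketched in base R without any package. This is a minimal illustration under stated assumptions, not the package implementation: the toy PWM, the helper functions scoreAt and scanSeq, and the example sequences are all hypothetical, and the percentage is assumed to be 100 times the raw score divided by the maximal possible PWM score. The sketch scores only windows that fit entirely within a sequence; how motifScanScores treats the trailing positions is not covered here.

```r
# Toy PWM (illustrative values): rows are bases, columns are motif positions.
# The best-scoring word for this matrix is "TAC".
pwm <- matrix(c(0.1, 0.7, 0.1,
                0.1, 0.1, 0.7,
                0.1, 0.1, 0.1,
                0.7, 0.1, 0.1),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "C", "G", "T"), NULL))

# Hypothetical helper: score of the motif starting at position `start`
scoreAt <- function(seq, pwm, start) {
  bases <- strsplit(substr(seq, start, start + ncol(pwm) - 1), "")[[1]]
  # pick one PWM entry per motif position and sum them
  sum(pwm[cbind(match(bases, rownames(pwm)), seq_len(ncol(pwm)))])
}

# Hypothetical helper: scores at every start where the motif fits
scanSeq <- function(seq, pwm) {
  starts <- seq_len(nchar(seq) - ncol(pwm) + 1)
  vapply(starts, function(s) scoreAt(seq, pwm, s), numeric(1))
}

seqs <- c("GGTACG", "TACGGG", "GGGTAC")   # equal-length toy sequences
seqOrder <- c(3, 1, 2)                    # assumed to index the input sequences

# One row per sequence (in the requested order), one column per start position
raw <- t(vapply(seqs[seqOrder], scanSeq, numeric(4), pwm = pwm))

# Assumed percentage scaling: relative to the maximal possible PWM score
pct <- 100 * raw / sum(apply(pwm, 2, max))
round(pct, 1)
```

In this toy case each row peaks where "TAC" starts in the corresponding sequence, which is the pattern one would look for when visualising the real score matrix.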

Value

The function returns a numeric matrix of motif scanning scores, with one row per input sequence (in the order given by seqOrder) and one column per position within the sequences.

Author(s)

Vanja Haberle

See Also

plotMotifScanScores
motifScanHits

Examples

library(GenomicRanges)
library(seqPattern)
load(system.file("data", "zebrafishPromoters.RData", package="seqPattern"))
promoterWidth <- elementMetadata(zebrafishPromoters)$interquantileWidth

load(system.file("data", "TBPpwm.RData", package="seqPattern"))

motifScores <- motifScanScores(regionsSeq = zebrafishPromoters,
                            motifPWM = TBPpwm, seqOrder = order(promoterWidth),
                            asPercentage = TRUE)
dim(motifScores)
motifScores[1:10,1:10]

seqPattern documentation built on Nov. 8, 2020, 7:52 p.m.