combinatorialDist: Combinatorial model approximation of the number of motif hits

View source: R/combinatorial_wrapper.R

Description

This function approximates the distribution of the number of motif hits. To this end, it sums over all combinations of obtaining k hits in a random sequence of a given length, using an efficient dynamic programming algorithm.
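To illustrate the general idea of summing over all hit combinations with dynamic programming, here is a minimal toy sketch. It is not the package's algorithm: it assumes that every position is a hit independently with the same probability and ignores overlapping hits, which combinatorialDist accounts for through the Overlap object. The names hit_count_dp and alpha are hypothetical and used for illustration only.

```r
# Toy dynamic program: distribution of the number of hits among `n` positions,
# ASSUMING each position is a hit independently with probability `alpha`.
# `hit_count_dp` and `alpha` are hypothetical names, not part of motifcounter.
hit_count_dp <- function(n, alpha) {
  # p[k + 1] holds P(k hits) over the positions processed so far
  p <- c(1, numeric(n))
  for (i in seq_len(n)) {
    # Position i is either not a hit (count unchanged) or a hit (count + 1)
    p <- p * (1 - alpha) + c(0, p[-length(p)]) * alpha
  }
  p
}

toy <- hit_count_dp(n = 10, alpha = 0.05)
```

Under these simplifying assumptions the recursion reproduces the binomial distribution, so it avoids enumerating the hit placements explicitly while still accounting for all of them.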

Usage

combinatorialDist(seqlen, overlap)

Arguments

seqlen

Integer-valued vector that defines the lengths of the individual sequences. For a given DNAStringSet, this information can be retrieved using numMotifHits.

overlap

An Overlap object.

Details

This function is an alternative to compoundPoissonDist. It requires fixed-length sequences and currently only supports computing the distribution of the number of hits when both DNA strands are scanned for motif hits.

Value

List containing

dist

Distribution of the number of hits
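As a hedged sketch of how such a result might be summarized, the snippet below uses a made-up list in place of a real return value. It assumes that dist is a numeric probability vector whose first entry corresponds to zero hits; check this against the actual output before relying on it. The helper tail_prob and the numbers are hypothetical.

```r
# Hypothetical stand-in for a returned list; the probabilities are made up.
# ASSUMPTION: entry k + 1 of `dist` is the probability of observing k hits.
res <- list(dist = c(0.70, 0.20, 0.08, 0.02))

# Hit counts 0, 1, 2, ... matching the entries of the distribution
hits <- seq_along(res$dist) - 1

# Expected number of hits under the distribution
expected_hits <- sum(hits * res$dist)

# Upper-tail probability P(X >= k); `tail_prob` is a hypothetical helper
tail_prob <- function(dist, k) {
  sum(dist[(seq_along(dist) - 1) >= k])
}
p_at_least_2 <- tail_prob(res$dist, 2)
```

An upper-tail probability of this kind is one way to judge whether an observed number of hits is unusually large relative to the background model.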

See Also

compoundPoissonDist

numMotifHits

probOverlapHit

Examples

# Load the package (needed for readBackground)
library(motifcounter)

# Load sequences
seqfile = system.file("extdata", "seq.fasta", package = "motifcounter")
seqs = Biostrings::readDNAStringSet(seqfile)

# Load motif
motiffile = system.file("extdata", "x31.tab", package = "motifcounter")
motif = t(as.matrix(read.table(motiffile)))

# Load background model
bg = readBackground(seqs, 1)

# Compute overlap probabilities
op = motifcounter:::probOverlapHit(motif, bg, singlestranded = FALSE)

# Use 2 sequences of length 100 bp each
seqlen = rep(100, 2) 

# Computes the combinatorial distribution of the number of motif hits
dist = motifcounter:::combinatorialDist(seqlen, op)

motifcounter documentation built on Nov. 8, 2020, 5:44 p.m.