This function approximates the distribution of the clump sizes.

clumpSizeDist(maxclump, overlap, method = "kopp")
| Argument | Description |
|---|---|
| `maxclump` | Maximal clump size |
| `overlap` | An Overlap object. |
| `method` | String that defines which method shall be invoked: 'pape' or 'kopp' (see description). Default: method = 'kopp'. |

The clump size distribution can be determined in two alternative ways:

- method='pape': A re-implemented version of the algorithm described in Pape et al., *Compound Poisson approximation of the number of occurrences of a position frequency matrix (PFM) on both strands*, 2008.
- method='kopp': An improved approximation of the clump size distribution that uses more appropriate statistical assumptions concerning overlapping motif hits and can also be used with order-d background models. This version is used by default.

List containing

- dist: Distribution of the clump size

# Load sequences
seqfile = system.file("extdata", "seq.fasta", package = "motifcounter")
seqs = Biostrings::readDNAStringSet(seqfile)
# Load motif
motiffile = system.file("extdata", "x31.tab", package = "motifcounter")
motif = t(as.matrix(read.table(motiffile)))
# Load background model
bg = readBackground(seqs, 1)
# Use 100 individual sequences of length 150 bp each
seqlen = rep(150, 100)
# Compute overlapping probabilities
# for scanning both DNA strands (singlestranded = FALSE)
op = motifcounter:::probOverlapHit(motif, bg, singlestranded = FALSE)
# Compute the clump size distribution up to a maximal clump size of 20
dist = motifcounter:::clumpSizeDist(20, op)
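
As a rough sketch of how the two methods described above might be compared, the following repeats the setup and calls clumpSizeDist once per method. It relies only on the signature and return value documented on this page. That method = 'pape' accepts an Overlap object built from an order-1 background, and that dist is a numeric vector, are assumptions and not something this page confirms. The names dist_kopp and dist_pape are illustrative only.

```r
library(motifcounter)

# Load sequences, motif and background model (same data as in the example)
seqfile = system.file("extdata", "seq.fasta", package = "motifcounter")
seqs = Biostrings::readDNAStringSet(seqfile)
motiffile = system.file("extdata", "x31.tab", package = "motifcounter")
motif = t(as.matrix(read.table(motiffile)))
bg = readBackground(seqs, 1)

# Overlapping hit probabilities for scanning both DNA strands
op = motifcounter:::probOverlapHit(motif, bg, singlestranded = FALSE)

# Same maximal clump size, two different approximations
dist_kopp = motifcounter:::clumpSizeDist(20, op, method = "kopp")
# Assumption: 'pape' works with this Overlap object (order-1 background)
dist_pape = motifcounter:::clumpSizeDist(20, op, method = "pape")

# Inspect the first entries of each 'dist' element
# (assumed to be numeric vectors)
print(head(dist_kopp$dist))
print(head(dist_pape$dist))
```

Comparing the leading entries is usually enough to see whether the two approximations differ noticeably for a given motif and background.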
