motifcounter-package: TFBSs analysis in DNA sequences


The package provides functions for determining the positions of motif hits, as well as motif hit enrichment, for a given position frequency matrix (PFM) in a DNA sequence of interest. The following examples guide you through the main functions of the 'motifcounter' package.

For an analysis with 'motifcounter', the user is required to provide 1) a PFM, 2) a DNA sequence that is used to estimate a background model (see readBackground), 3) a DNA sequence of interest that is scanned for motif hits (this can be the same sequence as in point 2), and 4) optionally, a desired false positive probability of motif hits in random DNA sequences (see motifcounterOptions).
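To make the first input more concrete, the following base R sketch builds a small PFM by hand. It assumes the layout of four rows (A, C, G, T) by one column per motif position, and it shows one common way to remove zero entries by adding a pseudo-count. The count values and the pseudo-count are hypothetical, and the rescaling only illustrates the idea; it is not necessarily what normalizeMotif does internally.

```r
# Hypothetical count matrix: rows are A, C, G, T; columns are motif positions.
# matrix() fills column by column, so each line below is one motif position.
counts <- matrix(c(8,  0, 1, 1,
                   0, 10, 0, 0,
                   2,  2, 3, 3),
                 nrow = 4,
                 dimnames = list(c("A", "C", "G", "T"), NULL))

# Add a small pseudo-count (assumed value) so that no entry is zero,
# then rescale every column so that it sums to one.
pseudo <- 0.01
shifted <- counts + pseudo
pfm <- sweep(shifted, 2, colSums(shifted), "/")

print(round(pfm, 3))
```

Each column of the resulting matrix is a probability distribution over the four nucleotides with strictly positive entries.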

Package:	motifcounter
Type:	Package
Version:	1.0
Date:	2016-11-04
License:	GPL-2

Wolfgang Kopp

Maintainer: Wolfgang Kopp <kopp@molgen.mpg.de>

# Load sequences
file = system.file("extdata", "seq.fasta", package = "motifcounter")
seqs = Biostrings::readDNAStringSet(file)

# Estimate an order-1 background model
order = 1
bg = readBackground(seqs, order)

# Load motif
motiffile = system.file("extdata", "x31.tab", package = "motifcounter")
motif = t(as.matrix(read.table(motiffile)))

# Normalize the motif
# Normalization is sometimes necessary to prevent zeros in
# the motif
motif = normalizeMotif(motif)

# Use subset of the sequences
seqs = seqs[1:10]

# Optionally, set the false positive probability
# alpha = 0.001  # this is also the default
# motifcounterOptions(alpha)

# Investigate the per-position and per-strand scores in a given sequence
scores = scoreSequence(seqs[[1]], motif, bg)

# Investigate the per-position and per-strand motif hits in a given sequence
hits = motifHits(seqs[[1]], motif, bg)

# Determine the average score profile across a set of sequences
scores = scoreProfile(seqs, motif, bg)

# Determine the average motif hit profile across a set of sequences
hits = motifHitProfile(seqs, motif, bg)

# Determine the empirical score distribution
scoreHistogram(seqs, motif, bg)

# Determine the theoretical score distribution in random sequences
scoreDist(motif, bg)


# Determine the motif hit enrichment in a set of DNA sequences
# 1. Use the compound Poisson approximation
#    and scan only a single strand for motif hits
result = motifEnrichment(seqs, motif, bg,
            singlestranded = TRUE, method = "compound")

# Determine the motif hit enrichment in a set of DNA sequences
# 2. Use the compound Poisson approximation
#    and scan both strands for motif hits
result = motifEnrichment(seqs, motif, bg,
            singlestranded = FALSE, method = "compound")

# Determine the motif hit enrichment in a set of DNA sequences
# 3. Use the combinatorial model
#    and scan both strands for motif hits
result = motifEnrichment(seqs, motif, bg, singlestranded = FALSE,
            method = "combinatorial")
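
The order-1 background model estimated at the start of the example can be pictured as a table of transition probabilities between neighbouring nucleotides. The base R sketch below counts the transitions in a short, hypothetical sequence and converts the counts to row-wise probabilities. It is meant only as an illustration of the concept; the estimate produced by readBackground may differ in its details (for example, in how strands or several sequences are handled).

```r
# Hypothetical toy sequence, split into single characters.
seq_chars <- strsplit("ACGTACGGTCA", "")[[1]]
alphabet <- c("A", "C", "G", "T")

# Pair every nucleotide with its successor.
from <- factor(head(seq_chars, -1), levels = alphabet)
to   <- factor(tail(seq_chars, -1), levels = alphabet)

# Count the observed transitions and normalize each row to sum to one,
# giving P(next nucleotide | current nucleotide).
trans_counts <- table(from, to)
trans_prob <- prop.table(trans_counts, margin = 1)

print(round(trans_prob, 2))
```

The toy sequence is chosen so that every nucleotide occurs at least once as a predecessor; otherwise a row of zero counts would produce undefined probabilities.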