BCRANK: predicting binding site consensus from ranked DNA sequences

Description

This function implements an algorithm for detecting short DNA sequences that are overrepresented in some part of a ranked list of DNA sequences. Starting from an initial consensus DNA sequence coded in IUPAC symbols, the method uses a heuristic search to improve the consensus until a local optimum is found. Individual predicted binding sites can be reported by the function matchingSites.
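To make the notion of an IUPAC-coded consensus concrete, the following is a minimal base R sketch of how such a consensus can be expanded and matched against sequences. The helper iupac_to_regex and the toy sequences are hypothetical and are not part of BCRANK; the sketch checks the forward strand only and does not reproduce the scoring that bcrank performs.

```r
## Hypothetical helper (not part of BCRANK): translate an IUPAC consensus
## into a regular expression over the bases A, C, G and T.
iupac_to_regex <- function(consensus) {
  codes <- c(A = "A", C = "C", G = "G", T = "T",
             R = "[AG]", Y = "[CT]", S = "[CG]", W = "[AT]",
             K = "[GT]", M = "[AC]",
             B = "[CGT]", D = "[AGT]", H = "[ACT]", V = "[ACG]",
             N = "[ACGT]")
  symbols <- strsplit(toupper(consensus), "")[[1]]
  ## Stop early on anything that is not a valid IUPAC nucleotide symbol
  stopifnot(all(symbols %in% names(codes)))
  paste(codes[symbols], collapse = "")
}

## Toy sequences, made up for illustration only
seqs <- c("TTCACGTGAC", "GGGGCCCCAA", "ATCATGTGGT")

## Y stands for C or T, so the consensus matches CACGTG and CATGTG
pattern <- iupac_to_regex("CAYGTG")
hits <- grepl(pattern, seqs)
```

Here hits marks which toy sequences contain at least one forward-strand match to the consensus.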

Usage

bcrank(fafile, startguesses=c(), restarts=10, length=10,
       reorderings=500, silent=FALSE, plot.progress=FALSE,
       do.search=TRUE, use.P1=FALSE, use.P2=TRUE, strip.desc=TRUE)

Arguments

fafile

a ranked fasta file containing DNA sequences.

startguesses

a character vector with consensus sequences in IUPAC coding to be used as starting sequences in the search. If empty, random start guesses will be generated.

restarts

number of restarts of the algorithm when using random start guesses.

length

length of the random start guesses.

reorderings

number of random reorderings of the DNA sequences performed when calculating the score.

silent

if FALSE, progress status is reported.

plot.progress

if TRUE, the progress is displayed in a plot.

do.search

if FALSE, no search is performed. In that case the start guesses are assigned scores and reported as results.

use.P1

Use penalty for bases other than A,C,G,T.

use.P2

Use penalty for motifs matching repetitive sequences.

strip.desc

Ignored (always treated as TRUE).

Value

The method returns an object of class BCRANKresult-class.

Author(s)

Adam Ameur, adam.ameur@genpat.uu.se

References

Ameur, A., Rada-Iglesias, A., Komorowski, J., Wadelius, C. Identification of candidate regulatory SNPs by combination of transcription factor binding site prediction, SNP genotyping and haploChIP. Nucleic Acids Res, 2009, 37(12):e85.

See Also

matchingSites, BCRANKresult-class

Examples

## Load example fasta file
fastaFile <- system.file("Exfiles/USF1_small.fa", package = "BCRANK")
## Run BCRANK (marked "Not run"; execute this line to create BCRANKout,
## which the code below requires)
## Not run: BCRANKout <- bcrank(fastaFile, restarts=20)

## Show BCRANK results
toptable(BCRANKout)
## The top scoring result
topMotif <- toptable(BCRANKout,1)
## Plot BCRANK search path
plot(topMotif)
## Position Weight Matrix
pwm(topMotif, normalize=FALSE)
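
The startguesses and do.search arguments described above can be combined to score a user-supplied consensus without running the search. The sketch below assumes that the BCRANK package is installed; the start guess "CACGTGAC" is an arbitrary illustrative choice, and the variable names startGuesses and scored are hypothetical.

```r
## Arbitrary illustrative start guess in IUPAC coding; replace with your own
startGuesses <- c("CACGTGAC")

## Only run when BCRANK is available
if (requireNamespace("BCRANK", quietly = TRUE)) {
  library(BCRANK)
  fastaFile <- system.file("Exfiles/USF1_small.fa", package = "BCRANK")
  ## With do.search=FALSE the start guesses are scored and reported as-is
  scored <- bcrank(fastaFile, startguesses = startGuesses,
                   do.search = FALSE, silent = TRUE)
  print(toptable(scored))
}
```

Because no search is performed, the reported consensus should be the start guess itself together with its score.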