Transcription factor binding site (TFBS) sequence patterns are often characterized by a position-nucleotide weight matrix (PWM) because it can be estimated from a small number of representative TFBS sequences. However, because the PWM probability model assumes independence between individual positions within the binding site, the PWMs for some TFs are poor discriminants of TFBS sequences from non-binding-site, noncoding DNA. Since three-dimensional DNA structure is recognized by TFs and is a determinant of binding specificity that depends on multi-base patterns, we developed a weak classifier, based on DNA shape parameter features extracted from DNA sequence, for predicting whether an oligonucleotide sequence (of variable length but at least six bp) is or is not within a cis-regulatory element (specifically, a transcription factor binding site). This classifier's predictions are returned as voting fraction scores that range from 0 to 1. The voting fraction scores can be used in tandem with a standard PWM score (such as can be obtained using the R package TFBSTools) as described in the accompanying paper "A DNA shape-based regulatory prior improves position-weight matrix-based recognition of transcription factor binding sites" (Yang and Ramsey, Dec. 2014).
Package details |
|
---|---|
Author | Jichen Yang, Stephen Ramsey |
Maintainer | Stephen Ramsey (lab.saramsey.org) <PLEASE_GET_MY_EMAIL_ADDRESS_AT_MY_WEBSITE@nowhere.com> |
License | file LICENSE |
Version | 1.0 |
URL | http://lab.saramsey.org http://github.com/ramseylab/regshape |
Package repository | View on GitHub |
Installation |
Install the latest version of this package by entering the following in R:
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.