findMotif: De novo discovery of discriminative motifs


View source: R/motif.rg.R

Description

The function searches for motifs that discriminate between the given foreground and background sequences.

Usage

findMotif(all.seq, category, weights = rep(1, length(all.seq)),
          start.width = 6, min.cutoff = 5, min.ratio = 1.3,
          min.frac = 0.01, both.strand = TRUE, flank = 2, max.motif = 5,
          mask = TRUE, other.data = NULL, start.nmer = NULL,
          enriched.only = FALSE, n.bootstrap = 5, bootstrap.pvalue = 0.1,
          is.parallel = TRUE, mc.cores = 4, min.info = 10, max.width = 15,
          discretize = TRUE)

Arguments

all.seq

DNAStringSet; foreground and background sequences.

category

numeric vector; specify which sequences are foreground (with value 1), and background (value 0).

weights

numeric vector; the weights for all sequences. Default: 1 for every sequence.

start.width

integer; the width for enumerating seed patterns

min.cutoff

numeric; the score cutoff required for seed selection. All scores are negative; the lower, the better.

min.ratio

numeric; the minimum fold change of motif occurrences in foreground vs background.

min.frac

numeric; the minimum fraction of fg/bg sequences containing the candidate motifs

both.strand

logical; if true, search both strands

flank

integer; the length for step-wise pattern extension at both ends on candidate motifs

max.motif

integer; the maximum number of output motifs

mask

logical; if true, mask previous motifs when searching for the next motif

other.data

if not NULL, a matrix with additional terms for the regression model for bias adjustment

start.nmer

if not NULL, a matrix with counts for user specified seed pattern in each sequence

enriched.only

logical; if true, only predict enriched motifs

n.bootstrap

integer; the number of bootstrapping tests to estimate score variance

bootstrap.pvalue

numeric; the bootstrap t-test p-value threshold used to determine the significance of improvement

is.parallel

logical; if true, run in parallel mode (requires the "parallel" library)

mc.cores

integer; the number of CPUs for a parallel run

min.info

minimal information content for the motif to prevent it from being too degenerate

max.width

maximum width of the motif for extension

discretize

logical; default TRUE
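
To make the min.ratio and min.frac thresholds concrete, the following is a minimal base-R sketch of how such a filter could be applied to one candidate motif. It is an illustration only, not motifRG's internal code: the per-sequence match counts are made up, the fold change is assumed to be the ratio of mean occurrences per sequence, and the fraction test is assumed to pass if either the foreground or the background set reaches min.frac.

```r
# Toy illustration only; this is NOT how motifRG is implemented internally.
# Hypothetical per-sequence match counts for a single candidate motif.
counts   <- c(2, 1, 0, 3, 1, 0, 0, 1, 0, 0)
category <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)  # 1 = foreground, 0 = background

min.ratio <- 1.3   # default value from the Usage section
min.frac  <- 0.01  # default value from the Usage section

fg <- counts[category == 1]
bg <- counts[category == 0]

# Assumed definition: ratio of mean occurrences per sequence (fg vs bg).
fold.change <- mean(fg) / mean(bg)

# Fraction of sequences in each set that contain at least one match.
frac.fg <- mean(fg > 0)
frac.bg <- mean(bg > 0)

# Assumed rule: keep the candidate if it is enriched enough and is present
# in a large enough fraction of either sequence set.
passes <- fold.change >= min.ratio && max(frac.fg, frac.bg) >= min.frac
```

With enriched.only left at its default, a depleted motif would presumably be judged by the inverse ratio; the sketch covers only the enriched case.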

Value

Returns a list with the following elements:

motifs

a list of motif descriptions of class Motif-class.

category

input binary specification of foreground/background

mask.motifs

if mask=TRUE, mask.motifs contains motif descriptions based on the motif matches found after the input sequences were masked by previous motifs. In this case, "motifs" contains the unmasked motif descriptions.
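
As a rough sketch of how the returned list might be inspected, the helper below reads only the three elements documented above. The name describeResult is hypothetical and not part of motifRG, and the sketch assumes that mask.motifs is absent (NULL) when masking is turned off. A mock list stands in for a real result so that the code runs without the package.

```r
# Hypothetical helper, not part of motifRG: summarises the documented
# elements (motifs, category, mask.motifs) of a findMotif() result.
describeResult <- function(res) {
  list(
    n.motifs   = length(res$motifs),          # number of reported motifs
    n.fg       = sum(res$category == 1),      # foreground sequences
    n.bg       = sum(res$category == 0),      # background sequences
    has.masked = !is.null(res$mask.motifs)    # assumed NULL when mask=FALSE
  )
}

# Mock result with placeholder entries, used only so the sketch is runnable;
# in practice, pass the list returned by findMotif().
mock <- list(
  motifs      = list("motif1", "motif2"),
  category    = c(1, 1, 0),
  mask.motifs = list("motif1", "motif2")
)
info <- describeResult(mock)
```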

Examples

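library(motifRG)  # also makes readDNAStringSet() from Biostrings available

### Load foreground (peak) and background (control) sequences
MD.peak.seq <- readDNAStringSet(system.file("extdata", "MD.peak.fa", package="motifRG"))
MD.control.seq <- readDNAStringSet(system.file("extdata", "MD.control.fa", package="motifRG"))
### 1 = foreground, 0 = background
category <- c(rep(1, length(MD.peak.seq)), rep(0, length(MD.control.seq)))
MD.motifs <- findMotif(append(MD.peak.seq, MD.control.seq), category, max.motif=3, enriched.only=TRUE)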

### Get summary of motifs
summaryMotif(MD.motifs$motifs, MD.motifs$category)

### plot the dinucleotide representation of the first motif
plotMotif(MD.motifs$motifs[[1]]@match$pattern)

### Create table of motifs in LaTeX
motifLatexTable(MD.motifs, main="MD motifs")

### Create table of motifs in HTML
motifHtmlTable(MD.motifs)

motifRG documentation built on April 28, 2020, 8:46 p.m.