Description Usage Arguments Details Value Author(s) See Also Examples
This is the main function of the WaveSeq algorithm modules implemented in R. There
are six processing steps:
(1) Pre-process the input BED file.
(2) Perform Monte Carlo sampling to estimate wavelet coefficient thresholds using the
continuous wavelet transform (CWT).
(3) Apply thresholds to CWT of input data to obtain putative peaks.
(4) Concatenate peaks within a user-specified distance (optional)
(5) If control data is specified, compare reads within peaks to control with an exact binomial test.
(5) If control data is absent, use randomized algorithm to estimate background distribution and
assign p-values to peaks.
(6) Correct p-values for multiple-testing.
1 2 3 4 | waveseq(chip, control=NA, outdir, exptname, preprocess = TRUE, redundancy = TRUE, thresdist = TRUE,
peak.calling = TRUE, mother = "morlet", winsize = 200, fragsize = 150, gap = 0, minreads = 8,
samplesize = 1e6, adj.method = "fdr", p.thres = 0.2, maxscale = 12, N = 5000, minsig = 2,
minsigscale = 3, maxsigscale = -1, p.min = 0.001, p.max = 0.3, binom.sided = "two.sided")
|
chip |
Input file in BED format. If |
control |
Padded bedGraph file for the control data being analyzed (default = NA). |
outdir |
Output directory (required). |
exptname |
Unique string to identify name of experiment (required). |
preprocess |
Logical value to indicate whether input file should be preprocessed or not. If missing or |
redundancy |
Logical value to indicate whether redundant reads should be removed before calculating summary counts (default=TRUE). |
thresdist |
Logical value to indicate whether threshold estimation step is to be performed or not. If missing or
|
peak.calling |
Logical value to indicate whether peak-calling step is to be performed or not (default=TRUE). |
mother |
Wavelet mother function used for the CWT. Tested choices are "morlet","haar","gaussian1", "gaussian2" (default="morlet"). |
winsize |
Window size for padded graph files (default=200). |
fragsize |
Library size used for sequencing (default=150). |
gap |
Maximum number of non-significant windows separating peaks that are to be concatenated (default = 0). |
minreads |
Minimum reads within a peak (default = 8). This filtering is necessary for a more accurate estimation of FDR. |
samplesize |
Number of peaks to sample for building ecdf with the randomized algorithm (default = 1e6). |
adj.method |
Method for multiple testing adjustment. Can use any methods accepted by |
p.thres |
Threshold p-value for calling significant windows (default=0.2) |
maxscale |
Maximum scale of wavelet decomposition. Also defines size of sample (2^maxscale) (default=12). |
N |
Number of samples (default=5000) |
minsig |
Minimum number of significant scales for a window to be considered significant (default=2). |
minsigscale |
Minimum significant scale considered for peak detection (default=3). |
maxsigscale |
Minimum significant scale considered for peak detection (default=-1). |
p.min |
Minimum quantile (1-p.min) to be output for wavelet coefficient distribution (default=0.3). |
p.max |
Maximum quantile (1-p.max) as above (default=0.001). |
binom.sided |
One-sided or two-sided binomial test, if control is not NA (default = "two.sided") |
This is the main function to apply the WaveSeq algorithm to Next-generation sequencing data e.g. ChIP-Seq.
This function concatenates peaks within a specified distance, estimates FDR in the presence or absence of control data and writes peaks to file. This file has the following fields:
chromosome |
Chromosome name |
start |
Start position of peak (1-based indexing) |
end |
End position of peak |
ChIP.reads |
Number of ChIP reads in the peak (normalized to per
million mapped reads if |
Control.reads |
Number of control reads in the peak (normalized to per
million mapped reads); Output only if |
Fold.Change |
Fold-change calculated as |
p.value |
P-value for the peak |
adj.p.value |
P-values adjusted for multiple comparisons using |
Apratim Mitra
sample
, ecdf
, p.adjust
, binom.test
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | ## Not run:
# load package
library(WaveSeqR)
# get paths to data files
WS.path <- system.file(package="WaveSeqR")
chip.file <- file.path(WS.path,"inst","extdata","gabp_valouev2008_chr22-pad.graph")
control.file <- file.path(WS.path,"inst","extdata","rxnoip_valouev2008_chr22-pad.graph")
test.dir <- file.path(WS.path,"extdata")
# ChIP-Seq analysis with control
waveseq(chip=chip.file,
outdir=test.dir,
control=control.file,
exptname="GABP_valouev_chr22_control",
preprocess=FALSE,
thresdist=FALSE,
gap=0,
winsize=200,
minreads=8,
adj.method="fdr")
# ChIP-Seq analysis without control
waveseq(chip=chip.file,
outdir=test.dir,
exptname="GABP_valouev_chr22_nocontrol",
preprocess=FALSE,
thresdist=FALSE,
gap=0,
winsize=200,
minreads=8,
samplesize=1e6,
adj.method="fdr")
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.