CallPeaks.oneRep: m6A peak calling with only one replicate.


CallPeaks.oneRep    R Documentation

m6A peak calling with only one replicate.

Description

This function calls peaks when only one biological replicate of the input and IP samples is available.

Usage

CallPeaks.oneRep(Counts, bins, sf = NULL,
                 WhichThreshold = "fdr_lfc",
                 pval.cutoff = 1e-05, fdr.cutoff = 0.05,
                 lfc.cutoff = 0.7, windlen = 5, lowCount = 10)

Arguments

Counts

A two-column data matrix containing bin-level read counts for both IP and input samples.

sf

A numerical vector containing the size factors of both IP and input samples. It can be provided by the user or estimated automatically from "Counts". Default is NULL.

bins

A data frame containing the genomic location (chr, start, end, strand) of each bin.

WhichThreshold

A character string specifying the criterion used to select significant bins during bump finding with an ad hoc algorithm. There are five options: "pval" (use p-values only), "fdr" (use FDR only), "lfc" (use log fold change only), "pval_lfc" (use both p-values and log fold changes) and "fdr_lfc" (use both FDR and log fold changes). Default is "fdr_lfc".

pval.cutoff

A constant indicating a cutoff for p-value. Default is 1e-05.

fdr.cutoff

A constant indicating a cutoff for FDR. Default is 0.05.

lfc.cutoff

A constant indicating a cutoff for log fold change. Default is 0.7, which corresponds to a fold change of about 2 (the natural log of 2 is approximately 0.69).

windlen

An integer specifying the number of consecutive bins used in the simple moving average smoothing of log fold change. Default is 5.

lowCount

An integer threshold used to filter out m6A regions with low read counts. Default is 10.
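
To make the interplay of windlen, fdr.cutoff and lfc.cutoff more concrete, here is a minimal sketch in base R. It is not the TRESS implementation: the vectors lfc and fdr hold made-up values, and both the centered moving average (via stats::filter) and the choice to threshold the smoothed values are assumptions made for illustration.

```r
# Hypothetical per-bin statistics (illustrative values only, not TRESS output).
lfc <- c(0.1, 0.2, 1.5, 1.8, 2.0, 1.6, 1.2, 0.3, 0.1, 0.0)
fdr <- c(0.90, 0.80, 0.01, 0.001, 0.001, 0.01, 0.04, 0.70, 0.90, 0.95)

windlen    <- 5
fdr.cutoff <- 0.05
lfc.cutoff <- 0.7

# Simple moving average over `windlen` consecutive bins (centered window).
# Bins near the edges lack a full window and get NA.
smooth.lfc <- as.numeric(stats::filter(lfc, rep(1 / windlen, windlen), sides = 2))

# "fdr_lfc"-style selection: a bin counts as significant only if it
# passes both the FDR cutoff and the log fold change cutoff.
sig <- !is.na(smooth.lfc) & fdr < fdr.cutoff & smooth.lfc >= lfc.cutoff
which(sig)
```

In this toy input the selected bins form one contiguous run, which is the kind of stretch a bump-finding step would merge into a single candidate region.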

Details

When there is only one replicate, TRESS assigns a p-value to each bin using a binomial test. It then calls candidate regions with the same algorithm used when there are multiple biological replicates. Binomial tests are performed once more on the candidates, and the significant ones form the final list of peaks.
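
As a rough illustration of a per-bin binomial test, the sketch below uses base R's binom.test. The counts and size factors are made up, and the null proportion derived from the size factors is an assumption of this sketch; the exact test that TRESS performs may differ.

```r
# Hypothetical read counts for four bins (illustrative values only).
ip    <- c(30, 12, 55, 8)
input <- c(10, 11, 20, 9)
sf    <- c(ip = 1.2, input = 1.0)   # assumed size factors

# Under the null of no enrichment, a read is assumed to come from the
# IP sample with probability proportional to the IP size factor.
p0 <- sf[["ip"]] / sum(sf)

# One-sided test per bin: is the IP share of reads larger than p0?
pvals <- mapply(function(x, n) {
  binom.test(x, n, p = p0, alternative = "greater")$p.value
}, ip, ip + input)

round(pvals, 4)
```

Bins whose IP share clearly exceeds the null proportion receive small p-values, while bins with roughly balanced counts do not.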

Value

It returns a data frame containing the following information for each peak:

chr

Chromosome number of each peak.

start

The genomic start position of each peak.

end

The genomic end position of each peak.

strand

The strand of each peak.

summit

The summit of each peak.

pvals

P-value for each peak calculated based on binomial test.

p.adj

Adjusted p-values using Benjamini-Hochberg procedure.

lg.fc

Log fold change between normalized IP and normalized input read counts.

Note that there are additional columns whose names match "*.bam". These columns contain the read counts from the IP and input samples.
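
The p.adj and lg.fc columns can be illustrated with a short base R sketch. All values below are made up, and the use of the natural log without a pseudo-count for lg.fc is an assumption of this sketch rather than a statement about what TRESS computes internally.

```r
# Hypothetical peak-level values (illustrative only, not TRESS output).
pvals <- c(1e-6, 0.002, 0.03, 0.4)
ip    <- c(120, 80, 40, 25)
input <- c(30, 35, 28, 24)
sf    <- c(ip = 1.2, input = 1.0)   # assumed size factors

# Benjamini-Hochberg adjustment of the p-values.
p.adj <- p.adjust(pvals, method = "BH")

# Log fold change between normalized IP and normalized input counts.
# Natural log and no pseudo-count are assumptions of this sketch.
lg.fc <- log((ip / sf[["ip"]]) / (input / sf[["input"]]))

data.frame(pvals, p.adj, lg.fc)
```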

Examples

## A toy example
data("Basal")
peaks = CallPeaks.oneRep(
    Counts = Basal$Bins$Counts,
    sf = Basal$Bins$sf,
    bins = Basal$Bins$Bins
    )
head(peaks, 3)

haowulab/TRESS documentation built on Aug. 27, 2022, 7:11 p.m.