findPeaksPerGene: Find peaks per gene

findPeaksPerGeneR Documentation

Find peaks per gene

Description

For finding the peaks (stall sites) per gene, with some default filters. A peak is basically a position of very high coverage compared to its surrounding area, as measured using zscore.

Usage

findPeaksPerGene(
  tx,
  reads,
  top_tx = 0.5,
  min_reads_per_tx = 20,
  min_reads_per_peak = 10,
  type = "max",
  gene_ids = names(tx),
  coverage = coveragePerTiling(tx, reads, TRUE, as.data.table = TRUE)
)

Arguments

tx

a GRangesList

reads

a GAlignments or GRanges, must be 1 width reads like p-shifts, or other reads that is single positioned. It will work with non 1 width bases, but you then get larger areas for peaks.

top_tx

numeric, default 0.50 (only use 50% top transcripts by read counts).

min_reads_per_tx

numeric, default 20. Gene must have at least 20 reads, applied before type filter.

min_reads_per_peak

numeric, default 10. Peak must have at least 10 reads.

type

character, default "max". Get only max peak per gene. Alternatives: "all", all peaks passing the input filter will be returned. "median", only peaks that is higher than the median of all peaks. "maxmedian": get first "max", then median of those.

gene_ids

character vector, names of genes, default names(tx)

coverage

a data.table of coverage, with columns position, score and genes

Details

For more details see reference, which uses a slightly different method by zscore of a sliding window instead of over the whole tx.

Value

a data.table of gene_id, position, counts of the peak, zscore and standard deviation of the peak compared to rest of gene area.

References

doi: 10.1261/rna.065235.117

Examples

df <- ORFik.template.experiment()
cds <- loadRegion(df, "cds")
# Load ribo seq from ORFik
rfp <- fimport(df[3,]$filepath)
# All transcripts passing filter
findPeaksPerGene(cds, rfp, top_tx = 0)
# Top 50% of genes
findPeaksPerGene(cds, rfp)

JokingHero/ORFik documentation built on Dec. 21, 2024, 12:01 a.m.