DPDE4PM: DPDE4PM merges RNA-seq peaks called on different samples

View source: R/DPDE4PM.R

Description

DPDE4PM merges RNA-seq peaks called on different samples.

Usage

DPDE4PM(
  GENE,
  PEAKS,
  GTF = NULL,
  ANNOTATION = NULL,
  RESOLUTION = 50,
  DP.ITERATIONS = 1000,
  WEIGHT.THRESHOLD = 0.2,
  N.SD = 1,
  OUTPUTDIR = ".",
  PLOT.RESULT = F,
  WRITE.OUTPUT = T,
  OUTPUT.TAG = "",
  ALPHA.PRIORS = c(1, 2),
  SEED = 123
)

Arguments

GENE

A character string giving the gene ID. It must match a gene_id in the GTF file and the 'name' column of PEAKS.

PEAKS

A data frame containing the following columns (extra columns are allowed). Most are the fields usually found in a BED12 file, using 0-based coordinates.

chr

chromosomes, same as in GTF file

start

starting position of the peak, base 0

end

end position of the peak, base 0

name

gene id

score

p-value associated with the peak

strand

strand of the gene

blockCount

number of segments in the peak

blockSizes

size of segments in the peak, BED12 notation

blockStarts

starting positions of segments, BED12 notation

sample

sample ID of the sample in which the peak was called
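
As a minimal sketch of the expected layout, the base R code below builds a PEAKS-style data frame with two peaks for one gene. The gene ID, coordinates, scores, and sample names are made up for illustration; only the column names are taken from the list above.

```r
# Hypothetical peaks for a single gene; all values are illustrative only.
peaks <- data.frame(
  chr         = c("chr1", "chr1"),
  start       = c(1000L, 1200L),       # 0-based start
  end         = c(1500L, 1800L),
  name        = c("ENSG00000000001", "ENSG00000000001"),  # made-up gene ID
  score       = c(0.001, 0.02),        # p-value of each peak
  strand      = c("+", "+"),
  blockCount  = c(1L, 2L),
  blockSizes  = c("500", "200,300"),   # comma-separated, BED12 style
  blockStarts = c("0", "0,300"),       # offsets relative to `start`
  sample      = c("sample_A", "sample_B"),
  stringsAsFactors = FALSE
)

# Basic consistency checks before passing the data frame on.
required <- c("chr", "start", "end", "name", "score", "strand",
              "blockCount", "blockSizes", "blockStarts", "sample")
stopifnot(all(required %in% colnames(peaks)))
stopifnot(all(peaks$start < peaks$end))
```

Keeping blockSizes and blockStarts as character strings mirrors how these fields appear when a BED12 file is read without further parsing.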

GTF

The GTF file used to generate the peaks. This is used to determine the genomic coordinates of the gene.

ANNOTATION

An object created by the read.gtf function. Supplying a pre-built annotation object saves compute time during parallelization.

RESOLUTION

The width, in base pairs, at which points are sampled from the peaks. A good choice is likely the window size used to generate the peaks.
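
One way to picture RESOLUTION is as the spacing of a regular grid laid over each peak. The sketch below is only an illustration of that idea with made-up coordinates; it is an assumption about the sampling scheme, not the package's actual code.

```r
RESOLUTION <- 50
peak_start <- 1000   # hypothetical 0-based start
peak_end   <- 1500   # hypothetical end

# One point every RESOLUTION bp across the peak (assumed scheme).
sampled <- seq(from = peak_start, to = peak_end, by = RESOLUTION)
length(sampled)
```

Under this reading, a smaller RESOLUTION yields more points per peak and therefore more data for the fit, at a higher compute cost.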

DP.ITERATIONS

Number of iterations used to fit the Dirichlet Process

WEIGHT.THRESHOLD

A proportion between 0 and 1: the minimum fraction of the data that a component of the Gaussian mixture model (GMM) must account for to be called a peak

N.SD

Number of standard deviations from the mean of each fitted Gaussian that should be considered part of the joint peak
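
To illustrate how WEIGHT.THRESHOLD and N.SD could interact, the sketch below filters a set of hypothetical mixture components by weight and then takes the mean plus or minus N.SD standard deviations as the interval for each retained component. This is a simplified reading of the two argument descriptions, not the package's implementation; the component values, the variable names, and the use of >= for the threshold are all assumptions.

```r
# Hypothetical fitted Gaussian components (not produced by DPDE4PM).
components <- data.frame(
  weight = c(0.55, 0.35, 0.10),
  mean   = c(1200, 2500, 4000),
  sd     = c(80, 120, 50)
)

WEIGHT.THRESHOLD <- 0.2
N.SD <- 1

# Keep components that account for at least WEIGHT.THRESHOLD of the data.
kept <- components[components$weight >= WEIGHT.THRESHOLD, ]

# Interval covered by each retained component: mean +/- N.SD * sd.
kept$start <- round(kept$mean - N.SD * kept$sd)
kept$end   <- round(kept$mean + N.SD * kept$sd)

print(kept[, c("weight", "start", "end")])
```

In this toy example the third component (weight 0.10) is dropped, and raising N.SD would widen the two remaining intervals.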

OUTPUTDIR

Output directory

PLOT.RESULT

TRUE or FALSE, whether plots should be generated

WRITE.OUTPUT

TRUE or FALSE, whether an output file should be saved

OUTPUT.TAG

A character string indicating a tag to track the generated files

ALPHA.PRIORS

A length-2 numeric vector giving the alpha and beta parameters of the Gamma distribution from which the concentration parameter alpha is drawn; see the R package dirichletprocess
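
For intuition about the default ALPHA.PRIORS = c(1, 2), the sketch below treats the two values as the shape and rate of a Gamma distribution. That parameterisation is an assumption made here for illustration; the dirichletprocess documentation is the authoritative source. Under this assumption the prior mean of the concentration parameter is shape / rate.

```r
ALPHA.PRIORS <- c(1, 2)   # assumed to be (shape, rate)
shape <- ALPHA.PRIORS[1]
rate  <- ALPHA.PRIORS[2]

prior_mean <- shape / rate      # mean of Gamma(shape, rate)
prior_var  <- shape / rate^2    # variance of Gamma(shape, rate)

set.seed(123)
draws <- rgamma(10000, shape = shape, rate = rate)

# The empirical mean should be close to prior_mean.
c(prior_mean = prior_mean, empirical_mean = mean(draws))
```

In a Dirichlet process, smaller concentration values favour fewer mixture components, so a prior concentrated near zero pushes the fit towards fewer joint peaks.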

SEED

A seed for reproducibility

Value

A data frame with BED12 columns plus an additional column for each sample in the PEAKS data frame, holding the p-value associated with the peak in that sample.


helen-zhu/DPDE4PM documentation built on Feb. 17, 2021, 9:46 a.m.