DUTest: Apply DEXSeq to detect differential peak usage

View source: R/differential_usage.R

DUTest    R Documentation

Apply DEXSeq to detect differential peak usage

Description

Apply DEXSeq to detect differential peak usage between selected populations. Works by building 'pseudo-bulk' profiles of each cell population: counts from individual cells are aggregated into a smaller number of profiles, with the number of profiles set by num.splits.
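The pseudo-bulk idea described above can be illustrated with a minimal base R sketch. This is not the package's internal implementation: the toy count matrix, the near-equal random split and the variable names counts, split.id and pseudo.bulk are all assumptions made for illustration.

```r
# Toy peak-by-cell count matrix: 4 peaks x 12 cells (made-up values)
set.seed(1)
counts <- matrix(rpois(4 * 12, lambda = 2), nrow = 4,
                 dimnames = list(paste0("peak", 1:4), paste0("cell", 1:12)))

num.splits <- 3  # number of pseudo-bulk profiles to build for this population

# Randomly assign each cell to one of num.splits groups of near-equal size
# (assumption: the real function may assign cells differently)
split.id <- sample(rep(seq_len(num.splits), length.out = ncol(counts)))

# Sum counts over the cells in each group: one column per pseudo-bulk profile
pseudo.bulk <- sapply(seq_len(num.splits), function(i) {
  rowSums(counts[, split.id == i, drop = FALSE])
})
colnames(pseudo.bulk) <- paste0("profile", seq_len(num.splits))

pseudo.bulk
```

Each pseudo-bulk column then plays the role of a replicate sample, which is the kind of input a bulk method such as DEXSeq expects. Changing the seed changes which cells land in which profile, which is why a seed.use argument is exposed.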

Usage

DUTest(
  peaks.object,
  population.1 = NULL,
  population.2 = NULL,
  exp.thresh = 0.1,
  fc.thresh = 0.25,
  adj.pval.thresh = 0.05,
  num.splits = 6,
  seed.use = 1,
  feature.type = c("UTR3", "exon"),
  replicates.1 = NULL,
  replicates.2 = NULL,
  include.annotations = FALSE,
  filter.pA.stretch = FALSE,
  verbose = TRUE,
  do.MAPlot = FALSE,
  return.dexseq.res = FALSE,
  ncores = 1
)

Arguments

peaks.object

Either a Seurat or SingleCellExperiment (SCE) object of peaks

population.1

a target population of cells (can be an ID/cluster label or a set of cell barcode IDs)

population.2

comparison population of cells. If NULL (default), uses all non-population.1 cells

exp.thresh

minimum percent expression threshold (for a population of cells) to include a peak

fc.thresh

threshold for log2 fold-change difference for returned results

adj.pval.thresh

threshold for adjusted P-value for returned results

num.splits

the number of pseudo-bulk profiles to create per identity class (default: 6)

seed.use

random seed used for the randomised assignment of cells to pseudo-bulk profiles

feature.type

genomic feature types to run analysis on (default: UTR3, exon)

replicates.1

an optional list to define the cells used as replicates for population.1. Will override anything set for the population.1 parameter.

replicates.2

an optional list to define the cells used as replicates for population.2. Will override anything set for the population.2 parameter.

include.annotations

whether to include junction, polyA motif and stretch annotations in output (default: FALSE)

filter.pA.stretch

whether to filter out peaks annotated as proximal to an A-rich region (default: FALSE)

verbose

whether to print outputs (TRUE by default)

do.MAPlot

make an MA plot of results (FALSE by default)

return.dexseq.res

return the raw and unfiltered DEXSeq results object (FALSE by default)

ncores

number of cores to run DEXSeq with
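As a sketch of how replicates.1 and replicates.2 might be prepared, the base R snippet below groups cell barcodes by sample of origin. The barcodes, sample labels and population labels are hypothetical, and the assumption that each list element is a character vector of cell barcodes is not confirmed by this page.

```r
# Hypothetical cell barcodes with their sample of origin and population label
barcodes <- paste0("cell", 1:8)
sample.of.origin <- c("s1", "s1", "s2", "s2", "s3", "s3", "s3", "s1")
population <- c("A", "A", "A", "A", "A", "B", "B", "B")

# One list element per sample, holding that sample's barcodes in population A
in.pop1 <- population == "A"
replicates.1 <- split(barcodes[in.pop1], sample.of.origin[in.pop1])

# Same for the comparison population B
in.pop2 <- population == "B"
replicates.2 <- split(barcodes[in.pop2], sample.of.origin[in.pop2])

str(replicates.1)
str(replicates.2)

# The lists could then be passed in place of population.1 / population.2
# (not run here, since it needs a real peaks object):
# res <- DUTest(peaks.object, replicates.1 = replicates.1,
#               replicates.2 = replicates.2)
```

Supplying replicates this way would let biological samples, rather than random splits of cells, define the pseudo-bulk profiles.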

Value

The results are returned as a DataFrame where each row corresponds to a peak coordinate. The default table contains the following columns: gene_name, genomic_feature(s), population1_pct, population2_pct, pvalue, padj and Log2_fold_change. genomic_feature(s) indicates the genomic feature type(s) that the peak overlaps. population1_pct and population2_pct indicate the percentage of cells expressing the peak in the target and comparison populations, respectively. The pvalue, padj and Log2_fold_change values are derived from the results table returned by the DEXSeq::DEXSeqResults function.
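To make the role of adj.pval.thresh and fc.thresh concrete, here is a small sketch that applies the same kind of thresholds to a toy table using the column names listed above. The values and row labels are made up, and filtering on the absolute log2 fold-change is an assumption about how the threshold is applied.

```r
# Toy results table using the documented column names (values are made up)
res <- data.frame(
  gene_name        = c("GeneA", "GeneB", "GeneC", "GeneD"),
  population1_pct  = c(0.40, 0.15, 0.30, 0.22),
  population2_pct  = c(0.10, 0.12, 0.35, 0.50),
  pvalue           = c(1e-6, 0.20, 0.01, 1e-4),
  padj             = c(1e-5, 0.40, 0.04, 1e-3),
  Log2_fold_change = c(1.2, 0.1, -0.1, -0.8),
  row.names        = c("peak1", "peak2", "peak3", "peak4")
)

adj.pval.thresh <- 0.05
fc.thresh <- 0.25

# Keep peaks passing both thresholds
# (assumption: the fold-change threshold applies to the absolute value)
keep <- res$padj < adj.pval.thresh & abs(res$Log2_fold_change) > fc.thresh
sig <- res[keep, ]

# Order the remaining peaks by adjusted P-value
sig <- sig[order(sig$padj), ]
sig
```

In this toy table, peak3 passes the adjusted P-value threshold but is dropped because its fold-change is too small, which shows that both conditions must hold.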

Examples

extdata_path <- system.file("extdata", package = "Sierra")
load(file.path(extdata_path, "TIP_cell_info.RData"))
## Not run: 
# Read the peak annotation table (one row per peak)
peak.annotations <- read.table("TIP_merged_peak_annotations.txt", header = TRUE, sep = "\t",
                               row.names = 1, stringsAsFactors = FALSE)

# Build a Seurat object of peaks; peak.counts is assumed to already be
# available in the session
peaks.seurat <- NewPeakSeurat(peak.data = peak.counts,
                              annot.info = peak.annotations,
                              cell.idents = tip.populations,
                              tsne.coords = tip.tsne.coordinates,
                              min.cells = 0, min.peaks = 0)

# Test for differential peak usage between the F-SL and EC1 populations
res.table <- DUTest(peaks.seurat, population.1 = "F-SL", population.2 = "EC1",
                    exp.thresh = 0.1, feature.type = c("UTR3", "exon"))

## End(Not run)


VCCRI/Sierra documentation built on July 3, 2023, 6:39 a.m.