MergePeakCoordinates: Merge peaks across a list of data-sets

View source: R/dataset_merging.R

MergePeakCoordinatesR Documentation

Merge peaks across a list of data-sets

Description

Takes as input a list of named peaks obtained from running peak calling on multiple data-sets. First goes through each peak set and check what peaks within each set should be merged (self-merging). Merging is based on similarity criteria set by sim.thresh and allow.match.var. Then compares each peak set as a reference to the remaining sets to identify peaks that should be merged. Returns a list of peaks that have been merged, as well as the unique peaks from each data-set.

Usage

MergePeakCoordinates(
  peak.dataset.table,
  output.file,
  sim.thresh = 0.75,
  allow.match.var = 0.25,
  ncores = 1
)

Arguments

peak.dataset.table

a dataframe with two required columnss: one called "Peak_file" , which contains file names of the peak data-sets to be merged and labels ("Identifier") for each file.

output.file

file to write the set of merged peaks to

sim.thresh

The required similarity threshold for merging (default: 0.75)

allow.match.var

The allowance for deviation from the sim.thresh for comparison peaks (default: 0.25)

ncores

number of cores to use (default 1)

Value

NULL. writes out a set of merged peaks to output.file

Examples

     
     
library(Sierra)
extdata_path <- system.file("extdata",package = "Sierra")
reference.file <- paste0(extdata_path,"/Vignette_cellranger_genes_subset.gtf")
junctions.file <- paste0(extdata_path,"/Vignette_example_TIP_sham_junctions.bed")
bamfile <- c(paste0(extdata_path,"/Vignette_example_TIP_sham.bam"),
            paste0(extdata_path,"/Vignette_example_TIP_mi.bam") )
whitelist.bc.file <- c(paste0(extdata_path,"/example_TIP_sham_whitelist_barcodes.tsv"),
                      paste0(extdata_path,"/example_TIP_MI_whitelist_barcodes.tsv"))

### Peak calling
peak.output.file <- c("Vignette_example_TIP_sham_peaks.txt",
                     "Vignette_example_TIP_MI_peaks.txt")
FindPeaks(output.file = peak.output.file[1],   # output filename
         gtf.file = reference.file,           # gene model as a GTF file
bamfile = bamfile[1],                # BAM alignment filename.
         junctions.file = junctions.file,     # BED filename of splice junctions exising in BAM file.
         ncores = 1)                          # number of cores to use


FindPeaks(output.file = peak.output.file[2],   # output filename
         gtf.file = reference.file,           # gene model as a GTF file
         bamfile = bamfile[2],                # BAM alignment filename.
         junctions.file = junctions.file,     # BED filename of splice junctions exising in BAM file.
         ncores = 1)

#### Peak merging
peak.dataset.table = data.frame(Peak_file = peak.output.file,
                               Identifier = c("TIP-example-Sham", "TIP-example-MI"),
                               stringsAsFactors = FALSE)

peak.merge.output.file = "TIP_merged_peaks.txt"
MergePeakCoordinates(peak.dataset.table, output.file = peak.merge.output.file, ncores = 1)
     
     
     
 

VCCRI/Sierra documentation built on July 3, 2023, 6:39 a.m.