FindPeaks | R Documentation |
Takes as input a BAM file produced from barcoded scRNA-seq experiment, the reference (GTF) file used during alignment and a BED file of junctions produced by regtools. For each gene in the reference file, the peak calling process first splits the read coverage into 'across junction' and 'no junction' subsets. Within each subset, the site of maximum coverage is identified and a peak called, by fitting a Gaussian to the read coverage, from a 600bp window around this region. After calling a peak, the local read coverage is removed and the next site of maximum coverage is identified. This process runs iteratively until at least one of two stopping criteria are reached. The first criteria is defined as the maximum read coverage a minimum cutoff (min.cov.cutoff) and proportion (min.cov.prop). The second critera is the size of the peak, including a absolute threshold (min.peak.cutoff) and a relative threshold (min.peak.prop).
FindPeaks(
output.file,
gtf.file,
bamfile,
junctions.file,
min.jcutoff = 50,
min.jcutoff.prop = 0.05,
min.cov.cutoff = 500,
min.cov.prop = 0.05,
min.peak.cutoff = 200,
min.peak.prop = 0.05,
ncores = 1,
chr.names = NULL,
filter.chr = FALSE,
gene.symbol.ref = "gene_name",
fit.method = "NLS"
)
output.file |
a file containing polyA sites |
gtf.file |
reference (GTF) file |
bamfile |
scRNA-seq BAM file |
junctions.file |
BED file (as produced by regtools) or SJ.out.tab file (STAR aligner) containing splice junction coordinates |
min.jcutoff |
minimum number of spliced reads across a junction for it to be considered (default: 50). |
min.jcutoff.prop |
minimum proportion of junction reads out of all junction reads for that gene (default: 0.05) |
min.cov.cutoff |
minimum number of reads to consider a peak (default: 500) |
min.cov.prop |
minimum proportion of reads to consider a peak (default: 0.05) |
min.peak.cutoff |
minimum peak height (default: 200) |
min.peak.prop |
minimum ratio of current peak height relative to maximum peak height for this gene (default: 0.05) |
ncores |
number of cores to use |
chr.names |
names of chromosomes |
filter.chr |
names of chromosomes to filter |
gene.symbol.ref |
field in the GTF file containing the gene symbol |
NULL. Writes counts to file.
extdata_path <- system.file("extdata",package = "Sierra")
reference.file <- paste0(extdata_path,"/Vignette_cellranger_genes_subset.gtf")
junctions.file <- paste0(extdata_path,"/Vignette_example_TIP_sham_junctions.bed")
bamfile <- c(paste0(extdata_path,"/Vignette_example_TIP_sham.bam"),
paste0(extdata_path,"/Vignette_example_TIP_mi.bam") )
peak.output.file <- c("Vignette_example_TIP_sham_peaks.txt", "Vignette_example_TIP_MI_peaks.txt")
FindPeaks(output.file=peak.output.file[1], gtf.file = reference.file,
bamfile=bamfile[1], junctions.file=junctions.file)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.