getSpikeInCounts: Filtering and counting spike-in reads

View source: R/spikein_functions.R

getSpikeInCountsR Documentation

Filtering and counting spike-in reads

Description

Filtering and counting spike-in reads

Usage

getSpikeInCounts(
  dataset.gr,
  si_pattern = NULL,
  si_names = NULL,
  field = "score",
  sample_names = NULL,
  expand_ranges = FALSE,
  ncores = getOption("mc.cores", 2L)
)

removeSpikeInReads(
  dataset.gr,
  si_pattern = NULL,
  si_names = NULL,
  field = "score",
  ncores = getOption("mc.cores", 2L)
)

getSpikeInReads(
  dataset.gr,
  si_pattern = NULL,
  si_names = NULL,
  field = "score",
  ncores = getOption("mc.cores", 2L)
)

Arguments

dataset.gr

A GRanges object or a list of GRanges objects.

si_pattern

A regular expression that matches spike-in chromosomes. Can be used in addition to, or as an alternative to si_names.

si_names

A character vector giving the names of the spike-in chromosomes. Can be used in addition to, or as an alternative to si_pattern.

field

The metadata field in dataset.gr that contains readcounts. If each range is an individual read, set field = NULL.

sample_names

An optional character vector used to rename the datasets in dataset.gr

expand_ranges

Logical indicating if ranges in dataset.gr should be treated as descriptions of single molecules (FALSE), or if ranges should be treated as representing multiple adjacent positions with the same signal (TRUE). See getCountsByRegions.

ncores

The number of cores to use for computations.

Value

A dataframe containing total readcounts, experimental (non-spike-in) readcounts, and spike-in readcounts.

Author(s)

Mike DeBerardine

Examples

#--------------------------------------------------#
# Make list of dummy GRanges
#--------------------------------------------------#

gr1_rep1 <- GRanges(seqnames = c("chr1", "chr2", "spikechr1", "spikechr2"),
                    ranges = IRanges(start = 1:4, width = 1),
                    strand = "+")
gr2_rep2 <- gr2_rep1 <- gr1_rep2 <- gr1_rep1

# set readcounts
score(gr1_rep1) <- c(1, 1, 1, 1) # 2 exp + 2 spike = 4 total
score(gr2_rep1) <- c(2, 2, 1, 1) # 4 exp + 2 spike = 6 total
score(gr1_rep2) <- c(1, 1, 2, 1) # 2 exp + 3 spike = 5 total
score(gr2_rep2) <- c(4, 4, 2, 2) # 8 exp + 4 spike = 12 total

grl <- list(gr1_rep1, gr2_rep1,
            gr1_rep2, gr2_rep2)

names(grl) <- c("gr1_rep1", "gr2_rep1",
                "gr1_rep2", "gr2_rep2")

grl

#--------------------------------------------------#
# Count spike-in reads
#--------------------------------------------------#

# by giving names of all spike-in chromosomes
getSpikeInCounts(grl, si_names = c("spikechr1", "spikechr2"), ncores = 1)

# or by matching the string/regular expression "spike" in chromosome names
getSpikeInCounts(grl, si_pattern = "spike", ncores = 1)

#--------------------------------------------------#
# Filter out spike-in reads
#--------------------------------------------------#

removeSpikeInReads(grl, si_pattern = "spike", ncores = 1)

#--------------------------------------------------#
# Return spike-in reads
#--------------------------------------------------#

getSpikeInReads(grl, si_pattern = "spike", ncores = 1)

mdeber/BRGenomics documentation built on Aug. 3, 2024, 3:43 a.m.