View source: R/signal_counting.R
getCountsByPositions | R Documentation |
Get the sum of the signal in dataset.gr that overlaps each position within each range in regions.gr. If binning is used (i.e. positions are wider than 1 bp), any function can be used to summarize the signal overlapping each bin. For a description of the critical difference between expand_ranges = FALSE and expand_ranges = TRUE, see getCountsByRegions.
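The binning described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: bin_signal is a hypothetical helper, the signal vector is made up, and the sketch assumes the region width is an exact multiple of the bin size.

```r
# Hypothetical helper (not part of BRGenomics): summarize a per-base signal
# vector in consecutive, non-overlapping bins of `binsize` positions.
bin_signal <- function(x, binsize = 1L, FUN = sum) {
  # Assumption for this sketch: the width is an exact multiple of binsize
  stopifnot(length(x) %% binsize == 0)
  bins <- ceiling(seq_along(x) / binsize)  # bin index of every position
  unname(vapply(split(x, bins), FUN, numeric(1)))
}

signal <- c(0, 2, 1, 0, 0, 3, 0, 0, 4, 1, 0, 0)  # made-up per-base counts
bin_signal(signal, binsize = 4)                  # sum per bin: 3 3 5
bin_signal(signal, binsize = 4, FUN = mean)      # mean per bin: 0.75 0.75 1.25
```

With binsize = 1, every position is its own bin, so the input vector is returned unchanged for FUN = sum.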
getCountsByPositions(
  dataset.gr,
  regions.gr,
  binsize = 1L,
  FUN = sum,
  simplify.multi.widths = c("error", "list", "pad 0", "pad NA"),
  field = "score",
  NF = NULL,
  blacklist = NULL,
  NA_blacklisted = FALSE,
  melt = FALSE,
  expand_ranges = FALSE,
  ncores = getOption("mc.cores", 2L)
)
dataset.gr |
A GRanges object in which signal is contained in metadata (typically in the "score" field), or a named list of such GRanges objects. |
regions.gr |
A GRanges object containing regions of interest. |
binsize |
Size of bins (in bp) to use for counting within each range of
|
FUN |
If |
simplify.multi.widths |
A string indicating the output format if the
ranges in |
field |
The metadata field of |
NF |
An optional normalization factor by which to multiply the counts.
If given, |
blacklist |
An optional GRanges object containing regions that should be excluded from signal counting. |
NA_blacklisted |
A logical indicating if NA values should be returned
for blacklisted regions. By default, signal in the blacklisted sites is
ignored, i.e. the reads are excluded. If |
melt |
A logical indicating if the count matrices should be melted. If
set to |
expand_ranges |
Logical indicating if ranges in |
ncores |
Multiple cores will only be used if |
If the widths of all ranges in regions.gr are equal, a matrix is returned that contains a row for each region of interest, and a column for each position (each base if binsize = 1) within each region. If dataset.gr is a list, a parallel list is returned containing a matrix for each input dataset.
If the input regions.gr contains ranges of varying widths, setting simplify.multi.widths = "list" will output a list of variable-length vectors, with each vector corresponding to an individual input region. If simplify.multi.widths = "pad 0" or "pad NA", the output is a matrix containing a row for each range in regions.gr, but the number of columns is determined by the largest range in regions.gr. For each region of interest, columns that correspond to positions outside of the input range are set, depending on the argument, to 0 or NA.
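As a rough illustration of the padded output formats, the base-R sketch below combines variable-length count vectors into a single matrix. The helper pad_counts and the example counts are hypothetical, and the sketch assumes that padding is appended after the last position of each shorter region; it is not the package's own code.

```r
# Hypothetical helper (not part of BRGenomics): combine variable-length
# per-region count vectors into one matrix with one row per region.
pad_counts <- function(counts_list, pad = 0) {
  ncol_out <- max(lengths(counts_list))  # width set by the longest region
  # Assumption for this sketch: shorter regions are padded on the right
  padded <- lapply(counts_list, function(x) c(x, rep(pad, ncol_out - length(x))))
  do.call(rbind, padded)
}

counts_list <- list(c(1, 0, 2), c(0, 5), c(3, 1, 0, 4))  # made-up counts
pad_counts(counts_list, pad = 0)   # analogous to "pad 0"
pad_counts(counts_list, pad = NA)  # analogous to "pad NA"
```

Padding with NA keeps positions outside a region distinguishable from positions with zero signal, which matters if column-wise summaries are later computed with na.rm = TRUE.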
Mike DeBerardine
getCountsByRegions, metaSubsample
library(BRGenomics) # provides getCountsByPositions and the example data
data("PROseq") # load included PROseq data
data("txs_dm6_chr4") # load included transcripts
#--------------------------------------------------#
# counts from 0 to 50 bp after the TSS
#--------------------------------------------------#
txs_pr <- promoters(txs_dm6_chr4, 0, 50) # first 50 bases
countsmat <- getCountsByPositions(PROseq, txs_pr)
countsmat[10:15, 41:50] # show only 41-50 bp after TSS
#--------------------------------------------------#
# redo with 10 bp bins from 0 to 100
#--------------------------------------------------#
# column 5 (the 41-50 bp bin) holds the row sums of the matrix shown above
txs_pr <- promoters(txs_dm6_chr4, 0, 100)
countsmat <- getCountsByPositions(PROseq, txs_pr, binsize = 10)
countsmat[10:15, ]
#--------------------------------------------------#
# same as the above, but with the average signal in each bin
#--------------------------------------------------#
countsmat <- getCountsByPositions(PROseq, txs_pr, binsize = 10, FUN = mean)
countsmat[10:15, ]
#--------------------------------------------------#
# standard deviation of signal in each bin
#--------------------------------------------------#
countsmat <- getCountsByPositions(PROseq, txs_pr, binsize = 10, FUN = sd)
round(countsmat[10:15, ], 1)