bootstrap-signal-by-position | R Documentation |
These functions perform bootstrap subsampling of mean readcounts at different
positions within regions of interest (metaSubsample), or, in the more
general case of metaSubsampleMatrix, column means of a matrix are
bootstrapped by sampling the rows. Mean signal counts can be calculated at
base-pair resolution, or over larger bins.
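The row-subsampling idea described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name sketch_subsample is hypothetical, and sampling without replacement and averaging the per-iteration means are assumptions made for the sketch.

```r
# Hypothetical helper (not part of the package): bootstrap the column means
# of a matrix by repeatedly sampling a fraction of its rows.
sketch_subsample <- function(mat, n.iter = 1000L, prop.sample = 0.1,
                             lower = 0.125, upper = 0.875) {
  n.rows <- max(1L, round(prop.sample * nrow(mat)))
  # Column means of one random subset of rows per iteration
  iter.means <- vapply(seq_len(n.iter), function(i) {
    idx <- sample.int(nrow(mat), n.rows)  # assumption: without replacement
    colMeans(mat[idx, , drop = FALSE])
  }, numeric(ncol(mat)))
  # Force a (positions x iterations) matrix, even for a single column
  iter.means <- matrix(iter.means, nrow = ncol(mat))
  data.frame(
    x = seq_len(ncol(mat)),
    mean = rowMeans(iter.means),  # assumption: mean of the iteration means
    lower = apply(iter.means, 1, quantile, probs = lower, names = FALSE),
    upper = apply(iter.means, 1, quantile, probs = upper, names = FALSE)
  )
}

# Toy input with made-up counts: 200 regions (rows) by 20 positions (columns)
set.seed(1)
toy.mat <- matrix(rpois(200 * 20, lambda = 5), nrow = 200)
res <- sketch_subsample(toy.mat, n.iter = 200L)
head(res)
```

Each output row summarizes one column of the input. The lower and upper columns are quantiles of the per-iteration means, so they describe the spread of the subsampled means and not the spread of the raw counts.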
metaSubsample(
  dataset.gr,
  regions.gr,
  binsize = 1L,
  first.output.xval = 1L,
  sample.name = deparse(substitute(dataset.gr)),
  n.iter = 1000L,
  prop.sample = 0.1,
  lower = 0.125,
  upper = 0.875,
  field = "score",
  NF = NULL,
  remove.empty = FALSE,
  blacklist = NULL,
  zero_blacklisted = FALSE,
  expand_ranges = FALSE,
  ncores = getOption("mc.cores", 2L)
)
metaSubsampleMatrix(
  counts.mat,
  binsize = 1L,
  first.output.xval = 1L,
  sample.name = NULL,
  n.iter = 1000L,
  prop.sample = 0.1,
  lower = 0.125,
  upper = 0.875,
  NF = 1L,
  remove.empty = FALSE,
  ncores = getOption("mc.cores", 2L)
)
dataset.gr |
A GRanges object in which signal is contained in metadata
(typically in the |
regions.gr |
A GRanges object containing intervals over which to metaplot. All ranges must have the same width. |
binsize |
The size of bin (in basepairs, or number of columns for
|
first.output.xval |
The relative start position of the first bin, e.g.
if |
sample.name |
Defaults to the name of the input dataset. This is
included in the output as a convenience, as it allows row-binding outputs
from different samples. If |
n.iter |
Number of random subsampling iterations to perform. Default is 1000. |
prop.sample |
The proportion of the ranges in |
lower , upper |
The lower and upper quantiles of subsampled signal means
to return. The defaults, 0.125 and 0.875, span the middle 75% of the subsampled means. |
field |
One or more metadata fields of |
NF |
An optional normalization factor by which to multiply the counts.
If given, |
remove.empty |
A logical indicating whether regions
( |
blacklist |
An optional GRanges object containing regions that should be excluded from signal counting. |
zero_blacklisted |
When set to |
expand_ranges |
Logical indicating if ranges in |
ncores |
Number of cores to use for computations. |
counts.mat |
A matrix over which to bootstrap column means by subsampling its rows. Typically, a matrix of readcounts with rows for genes and columns for positions within those genes. |
A dataframe containing x-values, means, lower quantiles, upper quantiles, and the sample name (included as a convenience for row-binding several such dataframes). If a list of GRanges is given as input, or if multiple fields are given, a single combined dataframe is returned containing data for all fields/datasets.
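As a rough illustration of why the sample name is carried in the output, the sketch below row-binds two toy dataframes. All values are made up, and the column names (x, mean, lower, upper, sample.name) are assumptions based on the description above, not confirmed output of these functions.

```r
# Toy dataframes with made-up values; column names are assumptions
df.a <- data.frame(x = 1:3,
                   mean = c(1.0, 2.0, 1.5),
                   lower = c(0.8, 1.7, 1.2),
                   upper = c(1.3, 2.4, 1.9),
                   sample.name = "sampleA")
df.b <- data.frame(x = 1:3,
                   mean = c(0.5, 1.1, 0.9),
                   lower = c(0.3, 0.9, 0.6),
                   upper = c(0.7, 1.4, 1.2),
                   sample.name = "sampleB")

# Because each row records its sample, the outputs stack into one long table
combined <- rbind(df.a, df.b)
table(combined$sample.name)
```

A long table like this can be passed to a plotting function that groups or colors by the sample column.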
Mike DeBerardine
getCountsByPositions
data("PROseq") # import included PROseq data
data("txs_dm6_chr4") # import included transcripts
# for each transcript, use promoter-proximal region from TSS to +100
pr <- promoters(txs_dm6_chr4, 0, 100)
#--------------------------------------------------#
# Bootstrap average signal in each 5 bp bin across all transcripts,
# and get confidence bands for middle 30% of bootstrapped means
#--------------------------------------------------#
set.seed(11)
df <- metaSubsample(PROseq, pr, binsize = 5,
                    lower = 0.35, upper = 0.65,
                    ncores = 1)
df[1:10, ]
#--------------------------------------------------#
# Plot bootstrapped means with confidence intervals
#--------------------------------------------------#
plot(mean ~ x, df, type = "l", main = "PROseq Signal",
     ylab = "Mean + 30% CI", xlab = "Distance from TSS")
polygon(c(df$x, rev(df$x)), c(df$lower, rev(df$upper)),
        col = adjustcolor("black", 0.1), border = FALSE)
#==================================================#
# Using a matrix as input
#==================================================#
# generate a matrix of counts in each region
countsmat <- getCountsByPositions(PROseq, pr)
dim(countsmat)
#--------------------------------------------------#
# bootstrap average signal in 10 bp bins across all transcripts
#--------------------------------------------------#
set.seed(11)
df <- metaSubsampleMatrix(countsmat, binsize = 10,
                          sample.name = "PROseq",
                          ncores = 1)
df[1:10, ]
#--------------------------------------------------#
# the same, using a normalization factor, and changing the x-values
#--------------------------------------------------#
set.seed(11)
df <- metaSubsampleMatrix(countsmat, binsize = 10,
                          first.output.xval = 0, NF = 0.75,
                          sample.name = "PROseq", ncores = 1)
df[1:10, ]