| thin_all | R Documentation |
Given a matrix of real RNA-seq counts, this function will apply a
thinning factor uniformly to every count in this matrix. This uniformly
lowers the read-depth for the entire dataset. The thinning factor should
be provided on the log2-scale. This is a specific application of the
binomial thinning approach in thin_diff. This particular
form of thinning was used by Robinson and Storey (2014) in the context
of deriving read-depth suggestions, and it is
described in detail in Gerard (2020).
thin_all(mat, thinlog2, type = c("thin", "mult"))
mat |
A numeric matrix of RNA-seq counts. The rows index the genes and the columns index the samples. |
thinlog2 |
A numeric scalar. This is the amount to shrink each count
in mat, on the log2-scale. |
type |
Should we apply binomial thinning ( |
A list-like S3 object of class ThinData.
Components include some or all of the following:
mat: The modified matrix of counts.
designmat: The design matrix of variables used to simulate
signal. This is made by column-binding design_fixed and the
permuted version of design_perm.
coefmat: A matrix of coefficients corresponding to
designmat.
design_obs: Additional variables that should be included in
your design matrix in downstream fittings. This is made by
column-binding the vector of 1's with design_obs.
sv: A matrix of estimated surrogate variables. In simulation studies you would probably leave this out and estimate your own surrogate variables.
cormat: A matrix of target correlations between the
surrogate variables and the permuted variables in the design matrix.
This might be different from the target_cor you input because
we pass it through fix_cor to ensure
positive semi-definiteness of the resulting covariance matrix.
matching_var: A matrix of simulated variables used to
permute design_perm if the target_cor is not
NULL.
David Gerard
Gerard, D. (2020). "Data-based RNA-seq simulations by binomial thinning." BMC Bioinformatics, 21(1), 206. doi:10.1186/s12859-020-3450-9.
Robinson, D. G., and Storey, J. D. (2014). "subSeq: determining appropriate sequencing depth through efficient read subsampling." Bioinformatics, 30(23), 3424-3426. doi:10.1093/bioinformatics/btu552.
select_counts: For subsampling the rows and columns of your real RNA-seq count matrix prior to applying binomial thinning.
thin_diff: For the more general thinning approach.
thin_lib: For thinning sample-wise.
thin_gene: For thinning gene-wise.
ThinDataToSummarizedExperiment: For converting a ThinData object to a SummarizedExperiment object.
ThinDataToDESeqDataSet: For converting a ThinData object to a DESeqDataSet object.
## Generate count data and set thinning factor
## In practice, you would obtain mat from a real dataset, not simulate it.
set.seed(1)
n <- 10
p <- 1000
lambda <- 1000
mat <- matrix(lambda, ncol = n, nrow = p)
thinlog2 <- 1
## Thin read-depths
thout <- thin_all(mat = mat, thinlog2 = thinlog2)
## Compare empirical and theoretical proportions
mean(thout$mat) / lambda
2 ^ -thinlog2
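For intuition about what the comparison above checks, the following base-R sketch mimics uniform binomial thinning without calling thin_all. It assumes that thinning replaces each count y with a draw from Binomial(y, 2^-thinlog2). That assumption is the standard form of binomial thinning and is used here only for illustration. It is not taken from the package source, and the names prob and thinned are hypothetical.

```r
## Base-R sketch of uniform binomial thinning (illustrative assumption,
## not the seqgendiff implementation): each count y is replaced by a
## draw from Binomial(y, 2^-thinlog2).
set.seed(1)
n <- 10
p <- 1000
lambda <- 1000
mat <- matrix(lambda, ncol = n, nrow = p)
thinlog2 <- 1
prob <- 2 ^ -thinlog2  # fraction of reads expected to be retained

## rbinom() is vectorized over size, so every entry of mat is thinned.
thinned <- matrix(rbinom(n = length(mat), size = mat, prob = prob),
                  nrow = nrow(mat), ncol = ncol(mat))

## Thinned counts never exceed the originals, and the empirical
## proportion retained should be close to prob.
all(thinned <= mat)
mean(thinned) / lambda
prob
```

Under this assumption a thinlog2 of 0 leaves the counts unchanged, since the retention probability is 1, and each additional unit of thinlog2 halves the expected read-depth.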