bayNorm | R Documentation |
This is the main wrapper function for bayNorm. The input is a matrix of raw scRNA-seq data and a vector of capture efficiencies of cells. You can also specify the condition of cells for normalizing multiple groups of cells separately.
bayNorm( Data, BETA_vec = NULL, Conditions = NULL, UMI_sffl = NULL, Prior_type = NULL, mode_version = FALSE, mean_version = FALSE, S = 20, parallel = TRUE, NCores = 5, FIX_MU = TRUE, GR = FALSE, BB_SIZE = TRUE, verbose = TRUE, out.sparse = FALSE )
Data |
A matrix of single-cell expression where rows
are genes and columns are samples (cells). |
BETA_vec |
A vector of capture efficiencies
(probabilities) of cells.
If it is null, library size (total count) normalized to
0.06 will be used
as the input |
Conditions |
A vector of condition labels, which should correspond to the columns of Data. Default is NULL, which assumes that all cells belong to the same group. |
UMI_sffl |
Scaling factors are required only for
non-UMI based data for which |
Prior_type |
Determines which groups of cells are used
in estimating prior using |
mode_version |
If TRUE, bayNorm returns the modes of the posterior estimates as normalized data, which is a 2D matrix, rather than samples from the posterior, which form a 3D array. Default is FALSE. |
mean_version |
If TRUE, bayNorm returns the means of the posterior estimates as normalized data, which is a 2D matrix, rather than samples from the posterior, which form a 3D array. Default is FALSE. |
S |
The number of samples you would like to
generate from estimated posterior distribution
(The third dimension of 3D array). Default is 20.
S needs to be specified if |
parallel |
If TRUE, |
NCores |
number of cores to use, default is 5. This will be used to set up a parallel environment using either MulticoreParam (Linux, Mac) or SnowParam (Windows) with NCores using the package BiocParallel. |
FIX_MU |
Whether to fix mu (the mean parameter of the prior distribution) to its MME estimate when estimating prior parameters by maximizing the marginal distribution. If TRUE, 1D optimization is used; otherwise 2D optimization over both mu and size is used (slow). Default is TRUE. |
GR |
If TRUE, the gradient function will be used in optimization. However, since the gradient function itself is very complicated, it does not speed up the optimization much. Default is FALSE. |
BB_SIZE |
If TRUE, estimate the size parameter of the prior by maximizing the marginal likelihood, and then use it to adjust the MME estimate of SIZE. Default is TRUE. |
verbose |
Whether to print status messages. Default is TRUE. |
out.sparse |
Only valid for mean version: Whether the output is of type dgCMatrix or not. Default is FALSE. |
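The default behaviour described for BETA_vec can be pictured with a few lines of base R. This is a minimal sketch, assuming that "normalized to 0.06" means rescaling the per-cell library sizes so that their mean is 0.06. The helper name default_beta and the toy matrix are hypothetical and not part of the bayNorm API.

```r
# Toy count matrix: 4 genes (rows) x 3 cells (columns), filled column by column.
counts <- matrix(
  c(10, 0, 5, 5,
    20, 10, 0, 10,
    30, 20, 10, 0),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

# Hypothetical helper (an assumption, not bayNorm's own code):
# rescale library sizes (column totals) so that their mean equals mean_beta.
default_beta <- function(counts, mean_beta = 0.06) {
  lib_size <- colSums(counts)
  lib_size / mean(lib_size) * mean_beta
}

beta <- default_beta(counts)
print(beta)
```

Cells with larger total counts receive proportionally larger capture efficiencies under this assumption, while the average across cells stays at 0.06.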
A wrapper function of prior estimation and bayNorm function.
List containing a 3D array of normalized expression (if mode_version=FALSE) or a 2D matrix of normalized expression (if mode_version=TRUE or mean_version=TRUE), a list containing the estimated priors, and a list containing the input parameters used: BETA_vec, Conditions (if specified), UMI_sffl (if specified), Prior_type, FIX_MU, BB_SIZE and GR.
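The difference between the two output shapes can be illustrated without running bayNorm. The sketch below uses a simulated array as a stand-in for posterior samples (genes x cells x S). The Poisson draws are purely illustrative, and averaging over the third dimension is only a sample-based analogue of a 2D summary, not necessarily how bayNorm computes its mean_version output.

```r
set.seed(1)

n_genes <- 5
n_cells <- 4
S <- 20  # number of samples per gene and cell (third dimension)

# Stand-in for a 3D array of posterior samples: genes x cells x S.
samples_3d <- array(
  rpois(n_genes * n_cells * S, lambda = 3),
  dim = c(n_genes, n_cells, S)
)

# Collapse the third dimension to obtain one value per gene and cell.
summary_2d <- apply(samples_3d, c(1, 2), mean)

print(dim(samples_3d))
print(dim(summary_2d))
```

A 3D result keeps S draws per gene and cell, which is useful when downstream analysis should reflect posterior uncertainty; a 2D result is smaller and can be passed to tools that expect an ordinary expression matrix.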
Wenhao Tang, Francois Bertaux, Philipp Thomas, Claire Stefanelli, Malika Saint, Samuel Blaise Marguerat, Vahid Shahrezaei. bayNorm: Bayesian gene expression recovery, imputation and normalisation for single cell RNA-sequencing data. Bioinformatics, btz726; doi: 10.1093/bioinformatics/btz726
library(bayNorm)
data('EXAMPLE_DATA_list')
# Return 3D array normalized data:
bayNorm_3D <- bayNorm(
  Data = EXAMPLE_DATA_list$inputdata[, seq(1, 30)],
  BETA_vec = EXAMPLE_DATA_list$inputbeta[seq(1, 30)],
  mode_version = FALSE,
  parallel = FALSE
)
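A variant of the example above that requests a 2D matrix of posterior means might look as follows. It is a sketch that uses only arguments listed in the usage section. The requireNamespace guard is an addition so that the snippet is skipped when the package is not installed, and str() is used to inspect the returned list instead of assuming its element names.

```r
# Run only if the bayNorm package is available on this machine.
if (requireNamespace("bayNorm", quietly = TRUE)) {
  library(bayNorm)
  data("EXAMPLE_DATA_list")

  # Return 2D matrix normalized data (means of posterior estimates):
  bayNorm_2D <- bayNorm(
    Data = EXAMPLE_DATA_list$inputdata[, seq(1, 30)],
    BETA_vec = EXAMPLE_DATA_list$inputbeta[seq(1, 30)],
    mean_version = TRUE,
    parallel = FALSE,
    verbose = FALSE
  )

  # Inspect the top-level structure of the returned list.
  str(bayNorm_2D, max.level = 1)
}
```

Setting parallel = FALSE keeps the run single-threaded, which is usually sufficient for a 30-cell subset.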