scMerge | R Documentation |
Merge single-cell RNA-seq data from different batches and experiments leveraging (pseudo)-replicates and control genes.
scMerge(
sce_combine,
ctl = NULL,
kmeansK = NULL,
exprs = "logcounts",
hvg_exprs = "counts",
batch_name = "batch",
marker = NULL,
marker_list = NULL,
ruvK = 20,
replicate_prop = 1,
cell_type = NULL,
cell_type_match = FALSE,
cell_type_inc = NULL,
BSPARAM = ExactParam(),
svd_k = 50,
dist = "cor",
WV = NULL,
WV_marker = NULL,
BPPARAM = SerialParam(),
return_all_RUV = FALSE,
assay_name = NULL,
plot_igraph = TRUE,
verbose = FALSE
)
sce_combine |
A |
ctl |
A character vector of negative control. It should have a non-empty intersection with the rows of sce_combine. |
kmeansK |
A vector indicates the kmeans's K for each batch. The length of kmeansK needs to be the same as the number of batch. |
exprs |
A string indicating the name of the assay requiring batch correction in sce_combine, default is logcounts. |
hvg_exprs |
A string indicating the assay that to be used for highly variable genes identification in sce_combine, default is counts. |
batch_name |
A character indicating the name of the batch column, default to "batch" |
marker |
An optional vector of markers, to be used in calculation of mutual nearest cluster. If no markers input, highly variable genes will be used instead. |
marker_list |
An optional list of markers for each batch, which will be used in calculation of mutual nearest cluster. |
ruvK |
An optional integer/vector indicating the number of unwanted variation factors that are removed, default is 20. |
replicate_prop |
A number indicating the ratio of cells that are included in pseudo-replicates, ranges from 0 to 1. Default to 1. |
cell_type |
An optional vector indicating the cell type information for each cell in the batch-combined matrix. If it is |
cell_type_match |
An optional logical input for whether to find mutual nearest cluster using cell type information. |
cell_type_inc |
An optional vector indicating the indices of the cells that will be used to supervise the pseudo-replicate procedure. |
BSPARAM |
A |
svd_k |
If BSPARAM is set to |
dist |
The distance metrics that are used in the calculation of the mutual nearest cluster, default is Pearson correlation. |
WV |
A optional vector indicating the wanted variation factor other than cell type info, such as cell stages. |
WV_marker |
An optional vector indicating the markers of the wanted variation. |
BPPARAM |
A |
return_all_RUV |
If |
assay_name |
The assay name(s) for the adjusted expression matrix(matrices). If |
plot_igraph |
If |
verbose |
If |
Returns a SingleCellExperiment
object with following components:
assays: the original assays and also the normalised matrix
metadata: containing the ruvK vector, ruvK_optimal based on F-score, and the replicate matrix
Yingxin Lin, Kevin Wang
## Loading example data
data('example_sce', package = 'scMerge')
## Previously computed stably expressed genes
data('segList_ensemblGeneID', package = 'scMerge')
## Running an example data with minimal inputs
sce_mESC <- scMerge(sce_combine = example_sce,
ctl = segList_ensemblGeneID$mouse$mouse_scSEG,
kmeansK = c(3, 3),
assay_name = 'scMerge')
sce_mESC = scater::runPCA(sce_mESC, exprs_values = "logcounts")
scater::plotPCA(sce_mESC, colour_by = 'cellTypes', shape = 'batch')
sce_mESC = scater::runPCA(sce_mESC, exprs_values = 'scMerge')
scater::plotPCA(sce_mESC, colour_by = 'cellTypes', shape = 'batch')
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.