spotclean | R Documentation |
This is the main function implementing the SpotClean method for decontaminating spot swapping effects in spatial transcriptomics data.
spotclean(slide_obj, ...)

## S3 method for class 'SummarizedExperiment'
spotclean(
  slide_obj,
  gene_keep = NULL,
  maxit = 30,
  tol = 1,
  candidate_radius = 5 * seq_len(6),
  kernel = "gaussian",
  verbose = TRUE,
  ...
)

## S3 method for class 'SpatialExperiment'
spotclean(
  slide_obj,
  gene_keep = NULL,
  gene_cutoff = 0.1,
  maxit = 30,
  tol = 1,
  candidate_radius = 5 * seq_len(6),
  kernel = "gaussian",
  verbose = TRUE,
  ...
)
slide_obj |
A slide object created or inherited from
|
... |
Arguments passed to other methods |
gene_keep |
(vector of chr) Gene names to keep for decontamination.
We recommend not decontaminating lowly expressed and lowly variable genes
in order to save computation time. Even if users include them, their
decontaminated expression will change little from the raw expression.
When setting to |
maxit |
(int) Maximum number of iterations for EM parameter updates. Default: 30. |
tol |
(num) Tolerance used to define convergence in EM parameter updates.
When the element-wise maximum difference between the current and updated
parameter matrices is less than |
candidate_radius |
(vector of num) Candidate contamination radii. A series of radii to try when estimating contamination parameters. Default: c(5, 10, 15, 20, 25, 30) |
kernel |
(chr) Name of the kernel used to model local contamination. Supports "gaussian", "linear", "laplace", "cauchy". Default: "gaussian". |
verbose |
(logical) Whether to print progress information.
Default: |
gene_cutoff |
(num) Filter out genes whose average expression
among tissue spots is less than or equal to this cutoff.
Only applies to |
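The default for candidate_radius and the convergence rule described for tol can be illustrated in base R. This is only a minimal sketch: has_converged, old_par and new_par are hypothetical names that are not part of SpotClean, and it assumes the threshold in the convergence rule is tol itself.

```r
# Default candidate radii: 5 * seq_len(6) expands to 5, 10, ..., 30
candidate_radius <- 5 * seq_len(6)
print(candidate_radius)

# Hypothetical helper illustrating the convergence rule described for `tol`
# (assumption: iteration stops once the largest element-wise change between
# the current and updated parameter matrices falls below `tol`).
has_converged <- function(old_par, new_par, tol = 1) {
  max(abs(new_par - old_par)) < tol
}

# Toy parameter matrices (made-up values, for illustration only)
old_par <- matrix(c(10, 20, 30, 40), nrow = 2)
new_par <- old_par + matrix(c(0.2, -0.5, 0.1, 0.9), nrow = 2)

# The largest element-wise change here is 0.9
print(has_converged(old_par, new_par, tol = 1))
print(has_converged(old_par, new_par, tol = 0.5))
```

A larger tol, as in the example call with tol=10, stops the updates earlier at the cost of a coarser estimate.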
Briefly, the contamination level for the slide is estimated based on the total counts of all spots. UMI counts travelling around the slide are assumed to follow Poisson distributions and are modeled by a mixture of Gaussian (proximal) and uniform (distal) kernels. The underlying uncontaminated gene expression is estimated by an EM algorithm that maximizes the data likelihood. A detailed derivation can be found in our manuscript.
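The idea of mixing a proximal Gaussian kernel with a distal uniform kernel can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: the toy grid, the bandwidth sigma, the value of distal_rate and the way the two kernels are combined are all assumptions made for this example.

```r
# Toy spot coordinates on a 3 x 3 grid (hypothetical data)
coords <- as.matrix(expand.grid(x = 1:3, y = 1:3))
n_spots <- nrow(coords)

# Pairwise Euclidean distances between spots
d <- as.matrix(dist(coords))

# Gaussian (proximal) kernel: weight decays with distance;
# each row is normalized to sum to 1
sigma <- 1  # assumed bandwidth, illustration only
gauss_w <- exp(-d^2 / (2 * sigma^2))
gauss_w <- gauss_w / rowSums(gauss_w)

# Uniform (distal) kernel: every spot is equally likely
unif_w <- matrix(1 / n_spots, nrow = n_spots, ncol = n_spots)

# Mixture of the two kernels, weighted by an assumed distal rate
distal_rate <- 0.2
mix_w <- (1 - distal_rate) * gauss_w + distal_rate * unif_w

# Each row is a probability distribution over destination spots
print(round(mix_w[1, ], 3))
```

In this sketch, row i of mix_w describes where a UMI originating at spot i would be expected to land: mostly at or near spot i, with a small uniform share spread over the whole slide.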
For a slide object created from createSlide(), returns a
slide object where the decontaminated expression matrix is in the
"decont" assay slot and the contamination statistics are in
metadata slots. Contamination statistics include the ambient RNA contamination
(ARC) score, bleeding rate, distal rate, contamination radius,
contamination kernel weight matrix, log-likelihood value in each iteration,
and the estimated proportion of contamination in each tissue spot in the observed data.
Since the decontaminated and raw data have different numbers of columns, they
cannot be stored in a single object.
For a SpatialExperiment object created from
SpatialExperiment::read10xVisium(), returns a SpatialExperiment
object where the decontaminated expression matrix
is in the "decont" assay slot and the contamination statistics are in
metadata slots. The raw expression matrix is also stored in the "counts" assay
slot. Genes are filtered based on gene_cutoff.
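The gene_cutoff rule (drop genes whose average expression among tissue spots is less than or equal to the cutoff) can be sketched in base R. The count matrix and the in_tissue indicator below are made-up toy data, and the code only illustrates the rule as described above; it is not the package's internal filtering code.

```r
# Toy raw count matrix: 4 genes x 5 spots (hypothetical data)
counts <- matrix(
  c(0, 0, 1, 0, 0,
    5, 3, 4, 6, 2,
    0, 0, 0, 0, 0,
    0, 0, 0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("spot", 1:5))
)

# Hypothetical indicator of which spots are covered by tissue
in_tissue <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

gene_cutoff <- 0.1

# Average expression of each gene among tissue spots only
avg_expr <- rowMeans(counts[, in_tissue, drop = FALSE])

# Keep genes strictly above the cutoff (drop those below or equal to it)
keep <- avg_expr > gene_cutoff
filtered <- counts[keep, , drop = FALSE]
print(rownames(filtered))
```

Note that gene4 is dropped here even though it has a nonzero count, because that count falls in a background spot and does not contribute to the tissue-spot average.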
data(mbrain_raw)
spatial_dir <- system.file(file.path("extdata",
                                     "V1_Adult_Mouse_Brain_spatial"),
                           package = "SpotClean")
mbrain_slide_info <- read10xSlide(
    tissue_csv_file = file.path(spatial_dir,
                                "tissue_positions_list.csv"),
    tissue_img_file = file.path(spatial_dir,
                                "tissue_lowres_image.png"),
    scale_factor_file = file.path(spatial_dir,
                                  "scalefactors_json.json"))
mbrain_obj <- createSlide(mbrain_raw,
                          mbrain_slide_info)
mbrain_decont_obj <- spotclean(mbrain_obj, tol = 10, candidate_radius = 20)
mbrain_decont_obj