knitr::opts_chunk$set(echo = TRUE)
This guide will demonstrate the usage of bindSC to intergrate RNA and ATAC under the gene regulation modeling with expression level regulated by both proximal and distal peaks.
User can refer to sim.R
for the details of simulaiton process (It will take 5 mins).
library(bindSC) dim(sim$X) dim(sim$Y) dim(sim$Z0)
?BiCCA
In this example, dataset X
is the gene expression matrix and Y
is the peak matrix. Z0
is the simulated gene activity matrix. Seurat, Liger and Harmony use (X
, Z0
) as input and BiCCA uses (X
, Z0
, Y
) as input.
bindSC also requires the modality specfic clustering ID X.clst
,Y.clst
to test the integraiton performance
The option tolereance
is usually set from 0.01 to 0.05 to avoid unnecessary iteration when sample size is large.
This step will take ~10s
out <- BiCCA(X=sim$X, Z0=sim$Z0, Y=sim$Y, K = 5, alpha = 0.1, lambda = 0.5, X.clst = sim$X_meta$Group, Y.clst = sim$Y_meta$Group, num.iteration = 100, temp.path = "./out", tolerance = 0.01, save = TRUE, block.size =10000)
Show the iteration index delta
plot(out$delta, xlab="step", ylab="delta")
There are three key ouputs.The vectors u
and v
are the low-dimension reprentation of two modalites which can be used for downstream joint clustering. The matrix Z_est
is the updated gene score matrix after iterations.
summary(out)
dim(out$u) dim(out$r) dim(out$Z_est)
cell_type <- c(sim$X_meta$Group, sim$Y_meta$Group) data_type <- c(rep("A", dim(out$u)[1]), rep("B",dim(out$r)[1])) bindsc_umap <- umap(rbind(out$u, out$r)) plt_dt <- data.frame("UMAP1"=bindsc_umap$layout[,1], "UMAP2"=bindsc_umap$layout[,2], "cell_type"= as.factor(cell_type), "data_type"=data_type) p<-ggscatter(plt_dt, x = "UMAP1", y = "UMAP2", color = "cell_type", palette = c("darkseagreen4","lightpink","darkorchid1"), repel = FALSE,size=1, alpha=1,legend.title = "", title="bindSC", font.title=16) + facet_grid(~data_type) p
out <- readRDS(file="./out/iteration1.rds") cca_umap <- umap(rbind(out$u, out$r)) plt_dt <- data.frame("UMAP1"=cca_umap$layout[,1], "UMAP2"=cca_umap$layout[,2], "cell_type"= as.factor(cell_type), "data_type"=data_type) # remove some outliers from CCA p<-ggscatter(plt_dt[(plt_dt$UMAP1>(-20) & plt_dt$UMAP2<20), ], x = "UMAP1", y = "UMAP2", color = "cell_type", palette = c("darkseagreen4","lightpink","darkorchid1"), repel = FALSE,size=1, alpha=1,legend.title = "", title="CCA", font.title=16) + facet_grid(~data_type) p
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.