knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The advent of single-cell sequencing technologies in profiling 3D genome organization led to development of single-cell high-throughput chromatin conformation (scHi-C) assays. Data from these assays enhance our ability to study dynamic chromatin architecture and the impact of spatial genome interactions on cell regulation at an unprecedented resolution. At the individual cell resolution, heterogeneity driven by the stochastic nature of chromatin fiber, various nuclear processes, and unwanted variation due to sequencing depths and batch effects poses major analytical challenges for inferring single cell-level 3D genome organizations.
To explicitly capture chromatin conformation features and distinguish cells based on their 3D genome organizations, we develop a simple and fast band normalization approach, BandNorm. BandNorm first removes genomic distance bias within a cell, and sequencing depth normalizes between cells. Consequently, BandNorm adds back a common band-dependent contact decay profile for the contact matrices across cells. The former step is achieved by dividing the interaction frequencies of each band within a cell with the cell’s band mean. The latter step is implemented by multiplying each scaled band by the average band mean across cells.

To use BandNorm, the following packages are necessary:
install.packages(c('ggplot2', 'dplyr', 'data.table', 'Rtsne', 'umap')) if (!requireNamespace("devtools", quietly=TRUE)) install.packages("devtools") devtools::install_github("immunogenomics/harmony")
BandNorm can be installed from Github:
devtools::install_github('sshen82/BandNorm', build_vignettes = FALSE)
Also, if the user doesn't have default R path writing permission, it is possible to install the package to the personal library:
withr::with_libpaths("personalLibraryPath", devtools::install_github("sshen82/BandNorm"))
There are three possible ways to input the data:
chr1 0 chr1 1000000 9 chr1 1000000 chr1 1000000 200 chr1 0 chr1 2000000 2 chr1 1000000 chr1 2000000 4 chr1 2000000 chr1 2000000 220 chr1 1000000 chr1 3000000 1 chr1 2000000 chr1 3000000 11 chr1 3000000 chr1 3000000 197 chr1 1000000 chr1 4000000 1 chr1 2000000 chr1 4000000 2
library(BandNorm) data(hic_df) print(hic_df[1:10, ])
For the example dataset, we also have batch information and cell-type information. Note that the cell_name in those external information should be consistent with the hic_df file.
data(batch) data(cell_type) print(batch[1:10, ]) print(cell_type[1:10, ])
You can also use download_schic function to download real data. The available data includes Kim2020, Li2019, Ramani2017, and Lee2019. You can specify one of the cell-types in those cell lines. There are also summary files available for them, and the summary includes batch, cell-type, depth and sparsity information. You can go to this website for more information. We won't run it here since it will mess up your computer.
# download_schic("Li2019", cell_type = "2i", cell_path = getwd(), summary_path = getwd())
We provide a demo scHi-C data named hic_df, sampled 400 cells of Astro, ODC, MG, Sst, four cell types and chromosome 1 from Ecker2019 for test run. After obtaining the hic_df file, you can use the bandnorm function to normalize the data. The result consists of 6 columns: chromosome, binA, binB, diag (which is binB - binA), cell (which is the name of the cell), and BandNorm. You can use the save option to save the normalized file, and if so, you need to give a path for output.
bandnorm_result = bandnorm(hic_df = hic_df, save = FALSE) bandnorm_result[1:10, ] # bandnorm_result = bandnorm(hic_df = hic_df, save = TRUE, save_path = getwd())
Then, you can use create_embedding function to obtain a PCA embedding for the data. You can choose to use Harmony to remove the batch effect.
embedding = create_embedding(hic_df = bandnorm_result, do_harmony = TRUE, batch = batch, chrs = "chr1") embedding[1:10, ]
Finally, using the embedding, you can use UMAP or tSNE to get a projection of all the cells on the 2D plane. If you have the cell-type information or batch information, it is possible to color them on the plot.
plot_embedding(embedding, "UMAP", cell_info = cell_type, label = "Cell Type") plot_embedding(embedding, "UMAP", cell_info = batch, label = "Batch")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.