Single-cell and paired chain data: scTCRseq and scBCRseq exploration
In immunarch: Bioinformatics Analysis of T-Cell and B-Cell Immune Repertoires

# knitr::knit_hooks$set(optipng = knitr::hook_optipng)
# knitr::opts_chunk$set(optipng = '-o7')

knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(fig.align = "center")
knitr::opts_chunk$set(fig.width = 18)
knitr::opts_chunk$set(fig.height = 12)

library(immunarch)
data(scdata)

Executive Summary

This is a vignette dedicated to provide an overview on how to work with single-cell paired chain data in immunarch

Single-cell support is currently in the development version. In order to access it, you need to install the latest development version of the package by executing the following command:

install.packages("devtools"); devtools::install_github("immunomind/immunarch", ref="dev")

To read paired chain data with immunarch use the repLoad function with .mode = "paired". Currently we support 10X Genomics only.

To create subset immune repertoires with specific barcodes use the select_barcodes function. Output of Seurat::Idents() as a barcode vector works.

To create cluster-specific and patient-specific datasets using barcodes from the output of Seurat::Idents() use the select_clusters function.

Use the data packaged with `immunarch`

Load the package into the R enviroment:

library(immunarch)

For testing purposes we attached a new paired chain dataset to immunarch. Load it by executing the following command:

data(scdata)

Load the paired chain data

To load your own datasets, use the repLoad function. Currently we implemented paired chain data support for 10X Genomics data only. A working example of loading datasets into R:

file_path <- paste0(system.file(package = "immunarch"), "/extdata/sc/flu.csv.gz")
igdata <- repLoad(file_path, .mode = "paired")

igdata$meta

head(igdata$data[[1]][c(1:7, 16, 17)])

Subset by barcodes

To subset the data by barcodes, use the select_barcodes function.

barcodes <- c("AGTAGTCAGTGTACTC-1", "GGCGACTGTACCGAGA-1", "TTGAACGGTCACCTAA-1")

new_df <- select_barcodes(scdata$data[[1]], barcodes)

new_df

Patient-specific datasets

To create a new dataset with cluster-specific immune repertoires, use the select_clusters function:

scdata_pat <- select_clusters(scdata, scdata$bc_patient, "Patient")

names(scdata_pat$data)

scdata_pat$meta

Cluster-specific datasets

To create a new dataset with cluster-specific immune repertoires, use the select_clusters function. You can apply this function after you created patient-specific datasets to get patient-specific cell cluster-specific immune repertoires, e.g., a Memory B Cell repertoire for a specific patient:

scdata_cl <- select_clusters(scdata_pat, scdata$bc_cluster, "Cluster")

names(scdata_cl$data)

scdata_cl$meta

Explore and compute statistics

Most functions will work out-of-the-box with paired chain data.

p1 <- repOverlap(scdata_cl$data) %>% vis()
p2 <- repDiversity(scdata_cl$data) %>% vis()

target <- c("CARAGYLRGFDYW;CQQYGSSPLTF", "CARATSFYYFHHW;CTSYTTRTTLIF", "CARDLSRGDYFPYFSYHMNVW;CQSDDTANHVIF", "CARGFDTNAFDIW;CTAWDDSLSGVVF", "CTREDYW;CMQTIQLRTF")
p3 <- trackClonotypes(scdata_cl$data, target, .col = "aa") %>% vis()

(p1 + p2) / p3