```r
knitr::opts_chunk$set(echo = TRUE) knitr::opts_chunk$set(fig.align = "center") knitr::opts_chunk$set(fig.width = 12) knitr::opts_chunk$set(fig.height = 5)
library(immunarch) data(immdata)
# Tracking of clonotypes Clonotype tracking is popular approach to monitor changes in frequency of clonotypes of interest in vaccination and cancer immunology. For example, a researcher can track a clonotype of across different time points in pre- and post-vaccination repertoires. Or analyse growth of malignant clonotypes in tumor sample. Various methods of clonotype tracking are integrated into the one `trackClonotypes` function. Currently, there are three methods to choose from. The output of `trackClonotypes` can be immediately visualised with the `vis` function. ## Tracking the most abundant clonotypes The simplest approach is to choose the most abundant clonotypes from one of the input immune repertoires and track across all immune repertoires in a batch. Arguments `.which` and `.col` are used to choose the immune repertoire, the number of clonotypes to take from it and which columns to use. To choose the 10 most abundant clonotypes from the first repertoire and track them using their CDR3 nucleotide sequence: ```r tc1 <- trackClonotypes(immdata$data, list(1, 5), .col = "nt")
Value list(1, 5)
of the .which
argument (second argument) means to choose 10 clonotypes from the 1st repertoire in the input list of repertoires immdata$data
. Value "nt"
of the .col
argument means that the function should take CDR3 nucleotide sequences only.
To choose the 10 most abundant amino acid clonotype sequences and their V genes from the "MS1" repertoire to track:
tc2 <- trackClonotypes(immdata$data, list("MS1", 10), .col = "aa+v")
Value list("MS1", "10")
of the .which
argument means to choose 10 clonotypes from the repertoire named "MS1" in the input list of repertoires immdata$data
. Value "aa+v"
of the .col
argument means that the function should take both CDR3 amino acid sequences and V gene segments of the most abundant clonotypes.
Visualisation of both approaches:
p1 <- vis(tc1) p2 <- vis(tc2) p1 / p2
In order to track specific clonotype sequences, you can provide nucleotide or amino acid sequences as the .which
argument, along with the column .col
specifying in which columns to search for sequences. For example, to track seven CDR3 amino acid sequences specifyed below you need to execute the following code:
target <- c("CASSLEETQYF", "CASSDSSGGANEQFF", "CASSDSSGSTDTQYF", "CASSLAGGYNEQFF", "CASSDSAGGTDTQYF", "CASSLDSYEQYF", "CASSSAGGYNEQFF") tc <- trackClonotypes(immdata$data, target, .col = "aa") vis(tc)
An improvement over the previous approach, it is possible to track clonotypes using information about both sequences and gene segments. Support you have a data frame of sequences with specific CDR3 sequences and gene segments. We will simulate this by choosing the 10 most abundant clonotypes from the first repertoire in the batch:
target <- immdata$data[[1]] %>% select(CDR3.aa, V.name) %>% head(10) target
Supply this data frame as an argument value to the .which
argument to track target clonotypes:
tc <- trackClonotypes(immdata$data, target) vis(tc)
Note that you can use any columns in the target
data frame, such as both CDR3 nucleotide and amino acid sequences and any gene segments.
There are three ways to visualise clonotype tracking, depending on your research and aesthetic needs. To choose the type of plot, you need to provide the ".plot"
parameter to the vis()
function, specifying one of three plot types:
- .plot = "smooth"
- used by default, a visualisation using smooth lines and stacked bar plots;
- .plot = "area"
- visualise abundances using areas under the abundance lines;
- .plot = "line"
- visualise only lines, connecting levels of abundances of a same clonotype between time points.
target <- c("CASSLEETQYF", "CASSDSSGGANEQFF", "CASSDSSGSTDTQYF", "CASSLAGGYNEQFF", "CASSDSAGGTDTQYF", "CASSLDSYEQYF", "CASSSAGGYNEQFF") tc <- trackClonotypes(immdata$data, target, .col = "aa") vis(tc, .plot = "smooth") vis(tc, .plot = "area") vis(tc, .plot = "line")
The .order
argument of the vis
function controls the order of samples in the visualisation. You can pass either indices of samples you plan to visualise or sample names.
# Passing indices names(immdata$data)[c(1, 3, 5)] # check sample names vis(tc, .order = c(1, 3, 5)) # You can change the order vis(tc, .order = c(5, 1, 3)) # Passing sample names vis(tc, .order = c("A2-i129", "A2-i133", "A4-i191"))
If your metadata contains information about time such as timepoints for vaccination or tumor samples, you can use it to re-order samples accordingly. In our examples immdata$meta
does not contain information about timepoints, so we will simulate this case.
First, we create an additional column in the metadata with randomly choosen timepoints:
immdata$meta$Timepoint <- sample(1:length(immdata$data)) immdata$meta
Next, we create a vector with samples in the right order, according to the "Timepoint" column (from smallest to greatest):
sample_order <- order(immdata$meta$Timepoint)
Sanity check: timepoints are following the right order:
immdata$meta$Timepoint[sample_order]
Samples, sorted by the timepoints:
immdata$meta$Sample[sample_order]
And finally, we visualise the data:
vis(tc, .order = sample_order)
It is possible to create a one-liner with the full pipeline from ordering to plotting:
vis(tc, .order = order(immdata$meta$Timepoint))
If you want to change the colour palette, add a ggplot2 scale_fill_*
function to the plot. We recommend using scale_fill_brewer
:
vis(tc) + scale_fill_brewer(palette = "Spectral") vis(tc) + scale_fill_brewer(palette = "RdBu")
Run ?scale_fill_brewer
in the R console to learn more about ColorBrewer and it's colour schemes.
Can not find an important feature? Have a question or found a bug? Contact us at support@immunomind.io
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.