```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
The goal of rTCRBCRr is to process the output of clonotyping tools such as TRUST, MiXCR, and immunoSEQ and to compute clonotype repertoire metrics from it.
The package is available on CRAN. You can install the released version of rTCRBCRr from CRAN with:
```r
install.packages("rTCRBCRr")
```
You can also install the development version from GitHub with:
```r
# install.packages("devtools")
devtools::install_github("sciencepeak/rTCRBCRr")
```

```r
library("rTCRBCRr")
library("magrittr")
library("readr")
```
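```r
# Choose the clonotyping tool whose example data will be loaded.
present_tool <- c("trust", "mixcr")[1]

# Locate the example files shipped with the package.
example_data_directory <- system.file(
  paste("extdata", present_tool, sep = "/"),
  package = "rTCRBCRr"
)
input_paths <- dir(example_data_directory, full.names = TRUE)
input_files <- dir(example_data_directory, full.names = FALSE)
input_files

# Derive sample names by dropping ".tsv" and anything after it.
sample_names <- sub("\\.tsv.*", "", input_files)
sample_names

# Read each file into a data frame and name the list elements by sample.
raw_clonotype_dataframe_list <- lapply(input_paths, readr::read_tsv) %>%
  magrittr::set_names(., value = sample_names)
raw_clonotype_dataframe_list
```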
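To make the naming step concrete, here is a minimal base R sketch of how sample names are derived from file names and attached to a list of data frames. The file names and the stand-in data frames are hypothetical and serve only as illustration; the real inputs come from `dir(example_data_directory)` and `readr::read_tsv`.

```r
# Hypothetical file names (assumption); real ones come from dir().
input_files <- c("sample_01.tsv.gz", "sample_02.tsv")

# Drop ".tsv" and anything after it to get the sample names.
sample_names <- sub("\\.tsv.*", "", input_files)

# Stand-in data frames (assumption) in place of readr::read_tsv() results.
toy_dataframe_list <- list(
  data.frame(count = c(3, 1)),
  data.frame(count = c(2, 2, 1))
)
names(toy_dataframe_list) <- sample_names

# Elements can now be looked up by sample name.
sample_02_rows <- nrow(toy_dataframe_list[["sample_02"]])
print(sample_names)
print(sample_02_rows)
```

Naming the list up front matters because the later steps look up and label results by sample name.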
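The tidy-up consists of four steps, each implemented by one function: `format_clonotype_to_immunarch_style`, `remove_nonproductive_CDR3aa`, `annotate_chain_name_and_isotype_name`, and `merge_convergent_clonotype`.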
```r
# If you only want to test one sample, you can process it on its own as follows.
the_divergent_clonotype_dataframe <- raw_clonotype_dataframe_list[["sample_01"]] %>%
  format_clonotype_to_immunarch_style(., clonotyping_tool = present_tool) %>%
  remove_nonproductive_CDR3aa %>%
  annotate_chain_name_and_isotype_name %>%
  merge_convergent_clonotype

# The single sample should then be put into a list whose element is named
# after the sample, because later steps need a named list of data frames as input.
divergent_clonotype_dataframe_list <- list(sample_01 = the_divergent_clonotype_dataframe)

# Normally you will have multiple samples, in which case the
# functional style of processing below is preferred.
divergent_clonotype_dataframe_list <- raw_clonotype_dataframe_list %>%
  lapply(., format_clonotype_to_immunarch_style, clonotyping_tool = present_tool) %>%
  lapply(., remove_nonproductive_CDR3aa) %>%
  lapply(., annotate_chain_name_and_isotype_name) %>%
  lapply(., merge_convergent_clonotype)
```
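Computing the repertoire metrics for all chains takes three functions: `compute_repertoire_metrics_by_chain_name`, `combine_all_sample_repertoire_metrics`, and `get_item_name_x_sample_name_for_each_metric`.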
```r
# Compute the repertoire metrics for all the chains of each sample.
all_sample_all_chain_all_metrics_wide_format_dataframe_list <- divergent_clonotype_dataframe_list %>%
  lapply(., compute_repertoire_metrics_by_chain_name)
all_sample_all_chain_all_metrics_wide_format_dataframe_list

# Combine the per-sample results into one wide-format data frame.
all_sample_all_chain_all_metrics_wide_format_dataframe <- all_sample_all_chain_all_metrics_wide_format_dataframe_list %>%
  combine_all_sample_repertoire_metrics
all_sample_all_chain_all_metrics_wide_format_dataframe

# Split the combined data frame into one data frame per metric.
all_sample_all_chain_individual_metrics_dataframe_list <- all_sample_all_chain_all_metrics_wide_format_dataframe %>%
  get_item_name_x_sample_name_for_each_metric
all_sample_all_chain_individual_metrics_dataframe_list
```
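Computing the isotype proportions of the IGH chain also takes three functions: `calculate_IGH_isotype_proportion`, `combine_all_sample_repertoire_metrics`, and `get_item_name_x_sample_name_for_each_metric`.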
```r
# Compute the proportions of all the isotypes of the IGH chain for each sample.
all_sample_IGH_chain_all_metrics_wide_format_dataframe_list <- divergent_clonotype_dataframe_list %>%
  lapply(., calculate_IGH_isotype_proportion)
all_sample_IGH_chain_all_metrics_wide_format_dataframe_list

# Combine the per-sample results into one wide-format data frame.
all_sample_IGH_chain_all_metrics_wide_format_dataframe <- all_sample_IGH_chain_all_metrics_wide_format_dataframe_list %>%
  combine_all_sample_repertoire_metrics
all_sample_IGH_chain_all_metrics_wide_format_dataframe

# Split the combined data frame into one data frame per metric.
all_sample_IGH_chain_individual_metrics_dataframe_list <- all_sample_IGH_chain_all_metrics_wide_format_dataframe %>%
  get_item_name_x_sample_name_for_each_metric
all_sample_IGH_chain_individual_metrics_dataframe_list
```
The repertoire metrics, namely richness, diversity (Shannon entropy), evenness (Pielou's evenness), clonality, and median (frequency median), are defined as follows, where $p_i$ is the frequency of ${\rm clonotype}_i$ in a sample with $N$ unique clonotypes and $P$ is the frequency vector of the unique clonotypes in that sample (Khunger, Rytlewski et al. 2019, Looney, Topacio-Hall et al. 2020).
$$ \text{richness} = N $$
$$ \text{Shannon entropy} = -\sum_{i=1}^{N} p_i \log_2\left(p_i\right) $$
$$ \text{Pielou's evenness} = \frac{\text{Shannon entropy}}{\log_2 N} $$
$$ \text{clonality} = 1 - \text{Pielou's evenness} $$
$$ \text{frequency median} = \text{median}(P) $$
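The formulas can be checked by hand on a small example. The following base R sketch applies them to a hypothetical vector of clone counts; the counts and variable names are illustrative assumptions, and the code is not the package's own implementation.

```r
# Hypothetical clone counts for one sample (illustrative values only).
clone_counts <- c(4, 2, 1, 1)

# P: frequency vector of the unique clonotypes; each element is one p_i.
P <- clone_counts / sum(clone_counts)

richness <- length(P)                                # N
shannon_entropy <- -sum(P * log2(P))                 # assumes every p_i > 0
pielou_evenness <- shannon_entropy / log2(richness)  # undefined when N = 1
clonality <- 1 - pielou_evenness
frequency_median <- median(P)

toy_metrics <- c(
  richness = richness,
  shannon_entropy = shannon_entropy,
  pielou_evenness = pielou_evenness,
  clonality = clonality,
  frequency_median = frequency_median
)
print(toy_metrics)
```

With these counts the frequencies are 0.5, 0.25, 0.125, and 0.125, so the Shannon entropy is 1.75 bits, the evenness is 1.75 / 2 = 0.875, and the clonality is 0.125. Because $\log_2 N = 0$ when $N = 1$, a sample with a single clonotype needs special handling in any implementation.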
The function `calculate_repertoire_metrics` implements the repertoire metric formulas.

```r
# Typing the function name without parentheses prints its source.
calculate_repertoire_metrics
```
The hexagon logo of the package was created with the help of the hexSticker package. The math formulas were written with the help of the recognition tool MyScript. The LaTeX formulas in Markdown were inspired by rmd4sci. The code in this study was inspired by the UCSB R tutorial note, the LymphoSeq script, and the vegan package.