cluster_scoring | R Documentation |
This method calculates the scores of CDR3 clusters as in the GLIPH and GLIPH2 algorithms. Depending on the information provided, a final score is calculated from up to five cluster properties: cluster size, enrichment of CDR3 lengths, enrichment of V genes, enrichment of clonal expansions and enrichment of common HLA alleles.
cluster_scoring(
  cluster_list,
  cdr3_sequences,
  refdb_beta = "gliph_reference",
  v_usage_freq = NULL,
  cdr3_length_freq = NULL,
  ref_cluster_size = "original",
  gliph_version = 1,
  sim_depth = 1000,
  hla_cutoff = 0.1,
  n_cores = 1
)
cluster_list |
list. Each element of this list contains a data frame in which the CDR3b sequences and additional information necessary for
scoring are provided. Corresponds to the |
cdr3_sequences |
vector or dataframe. This dataframe must contain the cdr3 sequences and optional additional information. The columns must be named as specified in the following list in arbitrary order.
|
refdb_beta |
character or data frame. By default
|
v_usage_freq |
data frame. By default |
cdr3_length_freq |
data frame. By default |
ref_cluster_size |
character. Either |
gliph_version |
numeric. Either |
sim_depth |
numeric. By default 1000. Simulated resampling depth for non-parametric convergence significance tests. A higher number will take longer to run but will produce more reproducible results. |
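To illustrate what a simulated resampling depth means, here is a minimal sketch of a generic non-parametric enrichment test. It is not the package's internal implementation; the helper `resample_pvalue` and the vectors `reference_lengths` and `cluster_lengths` are hypothetical and filled with made-up values. The sketch draws `sim_depth` random samples of the cluster's size from a reference pool and reports how often the most common CDR3 length is at least as frequent as in the observed cluster.

```r
# Hypothetical helper: empirical p-value for enrichment of the most common
# value in `observed`, relative to random draws from `reference`.
resample_pvalue <- function(observed, reference, sim_depth = 1000) {
  # Observed statistic: count of the most frequent value in the cluster
  obs_stat <- max(table(observed))
  # Compute the same statistic on sim_depth random samples of equal size
  sim_stats <- vapply(seq_len(sim_depth), function(i) {
    max(table(sample(reference, length(observed), replace = TRUE)))
  }, integer(1))
  # Add-one correction keeps the estimate strictly above zero
  (sum(sim_stats >= obs_stat) + 1) / (sim_depth + 1)
}

set.seed(1)
reference_lengths <- sample(10:18, 5000, replace = TRUE)  # made-up reference pool
cluster_lengths <- c(14, 14, 14, 14, 14, 15)              # made-up cluster

# The smallest attainable p-value is 1 / (sim_depth + 1), so a larger
# sim_depth gives a finer and more stable estimate at higher run time.
p_100 <- resample_pvalue(cluster_lengths, reference_lengths, sim_depth = 100)
p_1000 <- resample_pvalue(cluster_lengths, reference_lengths, sim_depth = 1000)
```

With a small `sim_depth`, repeated runs of such a test can differ noticeably, which is why a higher value trades run time for reproducibility.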
hla_cutoff |
numeric. By default 0.1. Defines the threshold of HLA probability scores below which HLA alleles are considered significant. |
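The role of the cutoff can be sketched with a small example. The allele names and scores below are invented for illustration and are not package output; only the comparison against `hla_cutoff` reflects the description above.

```r
# Made-up HLA probability scores for one cluster (not package output)
hla_scores <- c("A*02:01" = 0.03, "B*07:02" = 0.45, "DRB1*15:01" = 0.09)
hla_cutoff <- 0.1

# Alleles whose score falls below the cutoff are treated as significant
significant_hla <- names(hla_scores)[hla_scores < hla_cutoff]
significant_hla
```

In this sketch the comparison is strict, so a score exactly equal to the cutoff would not count as significant; whether the package treats the boundary the same way is an assumption.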
n_cores |
numeric. Number of cores to use, by default 1. In case of |
This function produces one file in the result_folder named "GLIPH_scoring_results.txt" containing the same information as the returned data frame.
The data frame contains the cluster scoring results. The first column provides the representative_seq for each evaluated cluster. The second column stores the total scores. Additional columns contain up to five scores (cluster size, CDR3 length enrichment, V-gene enrichment, enrichment of clonal expansion and enrichment of common HLA) used to evaluate the total score.
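As a rough sketch of how such a result might be inspected, the example below ranks clusters by the second column. The data frame is a mock with invented sequences, scores and column names. It also assumes that smaller total scores indicate stronger evidence, as for probability-like scores, which should be checked against the actual output.

```r
# Mock result with the documented layout: first column = representative
# sequence, second column = total score. Names and values are hypothetical.
mock_results <- data.frame(
  tag = c("CASSLAPGATNEKLFF", "CASSPGQGNTEAFF", "CASSIRSSYEQYF"),
  total_score = c(2.1e-05, 3.4e-09, 1.7e-02),
  stringsAsFactors = FALSE
)

# Order by the second column by position, so the code does not depend on
# the actual column name
ranked <- mock_results[order(mock_results[[2]]), ]
ranked[[1]]
```

Indexing by position follows the column order described above and avoids guessing the real column names.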
Glanville, Jacob, et al. "Identifying specificity groups in the T cell receptor repertoire." Nature 547.7661 (2017): 94.
https://github.com/immunoengineer/gliph
utils::data("gliph_input_data")
res <- turbo_gliph(cdr3_sequences = gliph_input_data[base::seq_len(200), ],
                   sim_depth = 100,
                   n_cores = 1)
scoring_results <- cluster_scoring(cluster_list = res$cluster_list,
                                   cdr3_sequences = gliph_input_data[base::seq_len(200), ],
                                   refdb_beta = "gliph_reference",
                                   gliph_version = 1,
                                   sim_depth = 100,
                                   n_cores = 1)