yassai_identifier

Description

TCR clonotype identifier (Yassai et al.)

Usage

yassai_identifier(data, V_after_C, J_before_FGxG, long = FALSE)

## S4 method for signature 'character,data.frame,data.frame,ANY'
yassai_identifier(data,
  V_after_C, J_before_FGxG, long = FALSE)

## S4 method for signature 'ANY,missing,missing,ANY'
yassai_identifier(data, long)

## S4 method for signature 'data.frame,data.frame,data.frame,logical'
yassai_identifier(data,
  V_after_C, J_before_FGxG, long = FALSE)

Arguments

data

A data frame or a character vector containing one or more clonotypes, with proper row or element names.

V_after_C

(optional) A data frame indicating the amino acids following the conserved cysteine for each V segment.

J_before_FGxG

(optional) A data frame indicating the amino acids preceding the conserved FGxG motif for each J segment.

long

(optional) If TRUE, avoids identifier collisions by displaying all codons and by indicating the position of the V–J junction in ambiguous cases. Defaults to FALSE.

Details

Implements the clonotype nomenclature defined by Yassai et al. in http://dx.doi.org/10.1007/s00251-009-0383-x.

By default, yassai_identifier() assumes mouse sequences and loads the V_after_C and J_before_FGxG tables distributed in this package. It is possible to provide alternative tables either by passing them directly as arguments, or by installing them as “./inst/extdata/V_after_C.txt.gz” and “./inst/extdata/J_before_FGxG.txt.gz”.

Some clonotypes have a different DNA sequence but the same identifier following the original nomenclature (see below for examples). The ‘long’ mode was created to avoid these collisions. First, it displays all codons, instead of only the non-templated ones and their immediate neighbors. Second, for the clonotypes where all codons are identical to the V or J germline sequence, it indicates the position of the V–J junction in place of the codon IDs.
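
As a rough illustration of what a collision looks like in practice, the following base R sketch flags identifiers that are shared by more than one DNA sequence. It does not call yassai_identifier(); the data frame and its column names (dna, id) are assumptions made for this example, filled with the two colliding clonotypes shown in the Examples section.

```r
# Hypothetical table of clonotypes whose identifiers were already computed
# in the default (short) mode. Column names are illustrative only.
clonotypes_tbl <- data.frame(
  dna = c("GCAGCTAATAACAACAATGCCCCACGA",
          "GCAGCAGCTAACAACAATGCCCCACGA"),
  id  = c("aAn.1A14-1A43L9", "aAn.1A14-1A43L9"),
  stringsAsFactors = FALSE
)

# Count the distinct DNA sequences behind each identifier.
n_dna <- tapply(clonotypes_tbl$dna, clonotypes_tbl$id,
                function(x) length(unique(x)))

# An identifier that stands for more than one DNA sequence is a collision.
collisions <- names(n_dna)[as.vector(n_dna > 1)]
print(collisions)
```

Identifiers reported this way are candidates for recomputation with long = TRUE.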

Value

The name (for instance sIRSSy.1456B19S1B27L11) consists of five segments:

  1. CDR3 amino acid identifier (ex. sIRSSy), followed by a dot;

  2. CDR3 nucleotide sequence identifier (ex. 1456);

  3. variable (V) segment identifier (ex. B19S1, for segment BV19S1);

  4. joining (J) segment identifier (ex. B27, for segment BJ2S7);

  5. CDR3 length identifier (ex. L11).
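
The way the five segments are concatenated can be sketched in base R as follows. This only pastes together the parts of the example name above; it is not how the package computes them, and all variable names are illustrative.

```r
# Parts copied from the example name; variable names are hypothetical.
cdr3_aa   <- "sIRSSy"  # 1. CDR3 amino acid identifier
cdr3_nt   <- "1456"    # 2. CDR3 nucleotide sequence identifier
v_segment <- "B19S1"   # 3. V segment, as it appears in the name
j_segment <- "B27"     # 4. J segment, as it appears in the name
cdr3_len  <- "L11"     # 5. CDR3 length identifier

# Only the first segment is followed by a separator (a dot).
id <- paste0(cdr3_aa, ".", cdr3_nt, v_segment, j_segment, cdr3_len)
print(id)

# Two parts can be recovered from a name without knowing the segment names:
aa_part  <- sub("\\..*$", "", id)                     # text before the first dot
len_part <- regmatches(id, regexpr("L[0-9]+$", id))   # trailing length tag
```

The middle of the name is left unparsed here, because splitting it reliably would require the list of valid V and J segment names.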

Methods (by class)

  • data = character,V_after_C = data.frame,J_before_FGxG = data.frame,long = ANY: TCR clonotype identifier (Yassai et al.)

  • data = ANY,V_after_C = missing,J_before_FGxG = missing,long = ANY: TCR clonotype identifier (Yassai et al.)

  • data = data.frame,V_after_C = data.frame,J_before_FGxG = data.frame,long = logical: TCR clonotype identifier (Yassai et al.)

See Also

codon_ids, J_before_FGxG, V_after_C

Examples

clonotypes <- read_clonotypes(system.file('extdata', 'clonotypes.txt.gz', package = "clonotypeR"))
head(yassai_identifier(clonotypes))

# The following two clonotypes have the same identifier, and are
# disambiguated by using the long mode

yassai_identifier(c(V="TRAV14-1", J="TRAJ43", dna="GCAGCTAATAACAACAATGCCCCACGA", pep="AANNNNAPR"))
# [1] "aAn.1A14-1A43L9"

yassai_identifier(c(V="TRAV14-1", J="TRAJ43", dna="GCAGCAGCTAACAACAATGCCCCACGA", pep="AAANNNAPR"))
# [1] "aAn.1A14-1A43L9"

yassai_identifier(c(V="TRAV14-1", J="TRAJ43", dna="GCAGCTAATAACAACAATGCCCCACGA", pep="AANNNNAPR"), long=TRUE)
# [1] "aAnnnnapr.1A14-1A43L9"

yassai_identifier(c(V="TRAV14-1", J="TRAJ43", dna="GCAGCAGCTAACAACAATGCCCCACGA", pep="AAANNNAPR"), long=TRUE)
# [1] "aaAnnnapr.1A14-1A43L9"

# The following two clonotypes would have the same identifier in long mode
# if the position of the V-J junction were not indicated in place of the
# codon IDs.

yassai_identifier(c(V="TRAV14N-1", J="TRAJ56", dna="GCAGCTACTGGAGGCAATAATAAGCTGACT", pep="AATGGNNKLT"), long=TRUE)
# [1] "aatggnnklt.1A14N1A56L10"

yassai_identifier(c(V="TRAV14N-1", J="TRAJ56", dna="GCAGCAACTGGAGGCAATAATAAGCTGACT", pep="AATGGNNKLT"), long=TRUE)
# [1] "aatggnnklt.2A14N1A56L10"
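
To see why the last two clonotypes need disambiguation, it can help to compare their DNA codon by codon. The base R sketch below does only that; split_codons() is a hypothetical helper written for this illustration and is not part of clonotypeR.

```r
# Hypothetical helper: cut an in-frame DNA string into 3-letter codons.
split_codons <- function(dna) {
  starts <- seq(1, nchar(dna), by = 3)
  substring(dna, starts, starts + 2)
}

a <- split_codons("GCAGCTACTGGAGGCAATAATAAGCTGACT")
b <- split_codons("GCAGCAACTGGAGGCAATAATAAGCTGACT")

# Position(s) at which the two clonotypes use a different codon.
differing <- which(a != b)
print(differing)

# The codons found at those positions in each clonotype.
print(rbind(a, b)[, differing])
```

The two sequences differ at a single codon while sharing the same peptide, AATGGNNKLT, so the amino acid part of the identifier alone cannot tell them apart.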