productiveSeq: Select productive sequences

View source: R/readProductiveSeq.R

productiveSeq R Documentation

Select productive sequences

Description

productiveSeq() selects productive nucleotide or amino acid CDR3 sequences from a tibble containing raw AIRR-formatted data. The raw data are aggregated on either the productive CDR3 amino acid sequence (junction_aa) or the productive CDR3 nucleotide sequence (junction). If "junction_aa" is selected, the resulting tibble displays the most frequently observed V, D, and J genes associated with each productive CDR3 amino acid sequence. If "junction" is selected, all columns of the input are retained in the output. The outputs differ because the same CDR3 amino acid sequence may be encoded by multiple unique junction sequences with differing V, D, and J genes.
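The difference between the two aggregation modes can be illustrated with a small base R sketch. This is not the package's implementation: the toy data, the `v_call` column, and the choice to rank V genes by summed `duplicate_count` are assumptions made for illustration only.

```r
# Toy data (hypothetical): two distinct nucleotide junctions encode "CASSF".
toy <- data.frame(
  junction = c("TGTGCCAGCAGTTTC", "TGCGCCAGCAGCTTC", "TGTGCCAGCCTGTTC"),
  junction_aa = c("CASSF", "CASSF", "CASLF"),
  v_call = c("TRBV5-1", "TRBV7-2", "TRBV5-1"),
  duplicate_count = c(30, 10, 60),
  stringsAsFactors = FALSE
)

# Aggregating on the amino acid sequence: sum counts per junction_aa.
by_aa <- aggregate(duplicate_count ~ junction_aa, data = toy, FUN = sum)

# Keep one V gene per amino acid sequence. Assumption: "most frequently
# observed" is taken here as the gene with the largest summed count.
top_v <- sapply(split(toy, toy$junction_aa), function(g) {
  totals <- tapply(g$duplicate_count, g$v_call, sum)
  names(totals)[which.max(totals)]
})
by_aa$v_call <- unname(top_v[by_aa$junction_aa])
by_aa$duplicate_frequency <- by_aa$duplicate_count / sum(by_aa$duplicate_count)

# Aggregating on the nucleotide sequence: every junction is already unique,
# so all rows and columns are kept and only the frequency is recomputed.
by_junction <- toy
by_junction$duplicate_frequency <-
  by_junction$duplicate_count / sum(by_junction$duplicate_count)
```

In this sketch `by_aa` has one row per amino acid sequence and a single V gene each, while `by_junction` keeps all three rows with their original gene calls.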

Usage

productiveSeq(study_table, aggregate = "junction_aa", prevalence = FALSE)

Arguments

study_table

A tibble consisting of antigen receptor sequencing data imported by the LymphoSeq2 function readImmunoSeq(). "junction_aa", "duplicate_count", and "duplicate_frequency" are required columns.

aggregate

Indicates whether the values of "duplicate_count" and "duplicate_frequency" should be aggregated by amino acid sequence or by junction (nucleotide) sequence. Acceptable values are "junction_aa" or "junction".

prevalence

A Boolean value

  • TRUE: Add a new column to the study table giving the prevalence of each CDR3 amino acid sequence in 55 healthy donor peripheral blood samples.

  • FALSE (the default): Do not add prevalence information.

Value

Returns a list of data frames of productive amino acid sequences with recomputed values for "duplicate_count" and "duplicate_frequency". A productive sequence is defined as a sequence that is in frame and does not have an early stop codon.
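The definition of a productive sequence can be sketched as a simple predicate. This is a minimal illustration, not the package's code: the helper name `is_productive` is hypothetical, and it assumes that an in-frame junction has a length divisible by three and that a stop codon appears as "*" in `junction_aa`.

```r
# Hypothetical helper: TRUE where a sequence is in frame and has no stop codon.
is_productive <- function(junction, junction_aa) {
  # Assumption: in frame means the nucleotide length is a multiple of three.
  in_frame <- !is.na(junction) & nchar(junction) %% 3 == 0
  # Assumption: stop codons are marked with "*" in the translated sequence,
  # and a missing or empty translation is treated as non-productive.
  no_stop <- !is.na(junction_aa) & nzchar(junction_aa) &
    !grepl("*", junction_aa, fixed = TRUE)
  in_frame & no_stop
}

productive <- is_productive(
  junction = c("TGTGCCAGCAGTTTC", "TGTGCCAGCAGTTT", "TGTGCCTGAAGTTTC"),
  junction_aa = c("CASSF", "", "CA*SF")
)
```

Here only the first sequence passes: the second is out of frame and the third contains a stop codon.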

Examples

file_path <- system.file("extdata", "TCRB_sequencing", package = "LymphoSeq2")
study_table <- LymphoSeq2::readImmunoSeq(path = file_path, threads = 1)
study_table <- LymphoSeq2::topSeqs(study_table, top = 100)
amino_table <- LymphoSeq2::productiveSeq(
  study_table = study_table,
  aggregate = "junction_aa",
  prevalence = TRUE
)
nucleotide_table <- LymphoSeq2::productiveSeq(
  study_table = study_table,
  aggregate = "junction",
  prevalence = FALSE
)

shashidhar22/LymphoSeq2 documentation built on Jan. 16, 2024, 4:29 a.m.