productive: Productive sequences
In LymphoSeq: Analyze high-throughput sequencing of T and B cell receptors


Remove unproductive CDR3 sequences from a single data frame.

productive(sample, aggregate = "aminoAcid")

sample

A data frame consisting of antigen receptor sequencing data. "aminoAcid", "count", and "frequencyCount" are required columns.

aggregate

Indicates whether the values of "count", "frequencyCount", and "estimatedNumberGenomes" should be aggregated by amino acid or nucleotide sequence. Acceptable values are "aminoAcid" or "nucleotide". If "aminoAcid" is selected, the resulting data frame will have the columns "aminoAcid", "count", "frequencyCount", and "estimatedNumberGenomes" (if this column is available). If "nucleotide" is selected, all columns in the original data frame will be present in the output data frame. The outputs differ because the same amino acid CDR3 sequence may be encoded by multiple unique nucleotide sequences with differing V, D, and J genes.
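To make the amino acid aggregation concrete, here is a minimal base R sketch on a toy data frame. It is not the package's implementation: the data frame toy_clones and its values are hypothetical, and expressing "frequencyCount" as a percentage of the total count is an assumption.

```r
# Hypothetical toy data: two nucleotide sequences encode the same
# amino acid sequence "CASS", a third encodes "CASF".
toy_clones <- data.frame(
  nucleotide = c("TGTGCCAGCAGT", "TGCGCCAGCAGT", "TGTGCCAGCTTT"),
  aminoAcid  = c("CASS", "CASS", "CASF"),
  count      = c(6, 2, 2),
  stringsAsFactors = FALSE
)

# Sum the counts of all rows that share an amino acid sequence.
by_aa <- aggregate(count ~ aminoAcid, data = toy_clones, FUN = sum)

# Recompute the frequency from the aggregated counts
# (assumption: frequencyCount is a percentage of the total count).
by_aa$frequencyCount <- by_aa$count / sum(by_aa$count) * 100

print(by_aa)
```

The nucleotide-level columns are dropped in this sketch because a single amino acid row no longer corresponds to one nucleotide sequence or one V, D, J gene combination.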

Returns a data frame of productive amino acid sequences with recomputed values for "count", "frequencyCount", and "estimatedNumberGenomes". A productive sequence is defined as a sequence that is in frame and does not have an early stop codon.
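The definition of a productive sequence can be illustrated with a small base R filter. This is only a sketch under stated assumptions, not the package's code: it assumes that out-of-frame rows have an empty "aminoAcid" value and that a stop codon is written as "*" within the amino acid string. The data frame toy_seqs and its values are hypothetical.

```r
# Hypothetical toy data: one productive row, one out-of-frame row
# (empty aminoAcid), and one row containing a stop codon ("*").
toy_seqs <- data.frame(
  aminoAcid = c("CASSLGQETQYF", "", "CASS*GQETQYF"),
  count     = c(5, 3, 2),
  stringsAsFactors = FALSE
)

# Assumption: an empty or missing aminoAcid marks an out-of-frame sequence.
in_frame <- !is.na(toy_seqs$aminoAcid) & nzchar(toy_seqs$aminoAcid)

# Assumption: "*" marks a stop codon; fixed = TRUE matches it literally.
no_stop <- !grepl("*", toy_seqs$aminoAcid, fixed = TRUE)

# Keep only rows that satisfy both conditions.
kept <- toy_seqs[in_frame & no_stop, ]

print(kept)
```

After a filter like this, the counts and frequencies of the remaining rows would need to be recomputed, which is what the returned data frame reflects.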

productiveSeq

file.path <- system.file("extdata", "TCRB_sequencing", package = "LymphoSeq")

file.list <- readImmunoSeq(path = file.path)

productive.aa <- productive(sample = file.list[["TRB_Unsorted_32"]], aggregate = "aminoAcid")