Remove unproductive CDR3 sequences from a single data frame.
A data frame consisting of antigen receptor sequencing data. "aminoAcid", "count", and "frequencyCount" are required columns.
Indicates whether the values of "count", "frequencyCount", and "estimatedNumberGenomes" should be aggregated by amino acid or nucleotide sequence. Acceptable values are "aminoAcid" or "nucleotide". If "aminoAcid" is selected, the resulting data frame will have the columns "aminoAcid", "count", "frequencyCount", and "estimatedNumberGenomes" (if this column is available). If "nucleotide" is selected, all columns in the original data frame will be present in the output data frame. The output differs because the same amino acid CDR3 sequence may be encoded by multiple unique nucleotide sequences with differing V, D, and J genes.
Returns a data frame of productive amino acid sequences with recomputed values for "count", "frequencyCount", and "estimatedNumberGenomes". A productive sequence is defined as a sequence that is in frame and does not have an early stop codon.
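The filtering and amino acid aggregation described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name filter_productive is hypothetical, out-of-frame rows are assumed to have an empty or missing "aminoAcid" value, a stop codon is assumed to be written as "*", and "frequencyCount" is assumed to be the percentage of the total count that remains after filtering.

```r
# Hypothetical helper for illustration only; not the package's own code.
filter_productive <- function(sample) {
  aa <- sample$aminoAcid
  # Assumption: out-of-frame rows have an empty or NA aminoAcid,
  # and a stop codon appears as "*" in the amino acid string.
  keep <- !is.na(aa) & nzchar(aa) & !grepl("*", aa, fixed = TRUE)
  productive <- sample[keep, c("aminoAcid", "count"), drop = FALSE]
  # Sum counts over nucleotide sequences that encode the same
  # amino acid sequence.
  out <- aggregate(count ~ aminoAcid, data = productive, FUN = sum)
  # Assumption: frequencyCount is the percentage of the remaining total.
  out$frequencyCount <- out$count / sum(out$count) * 100
  # Most abundant sequences first.
  out[order(-out$count), ]
}

# Toy data: two nucleotide sequences encode "CAS", one row has a stop
# codon, and one row is out of frame (empty aminoAcid).
toy <- data.frame(
  nucleotide = c("TGTGCCAGC", "TGCGCCAGC", "TGTGCCTGA", "TGTGCCA", "TGTGCTTGG"),
  aminoAcid  = c("CAS", "CAS", "CA*", "", "CAW"),
  count      = c(6, 2, 5, 3, 2),
  stringsAsFactors = FALSE
)
toy$frequencyCount <- toy$count / sum(toy$count) * 100

result <- filter_productive(toy)
print(result)
```

In this toy input the two nucleotide sequences encoding "CAS" collapse into a single row with a count of 8, while the stop-codon and out-of-frame rows are dropped. Aggregating by nucleotide instead would keep one row per nucleotide sequence, which is why the other columns of the input can be preserved in that mode.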