coding: Filter out coding and non-coding clonotype sequences

View source: R/preprocessing.R

coding R Documentation

Filter out coding and non-coding clonotype sequences

Description

Filter out clonotypes with non-coding, coding, in-frame or out-of-frame CDR3 sequences:

'coding()' - remove all non-coding sequences (i.e., remove all sequences with stop codons or frame shifts);

'noncoding()' - remove all coding sequences (i.e., keep only sequences with stop codons or frame shifts);

'inframes()' - remove all out-of-frame sequences (i.e., remove all sequences with frame shifts);

'outofframes()' - remove all in-frame sequences (i.e., leave sequences with frame shifts only).

Note: each of these functions also removes all clonotypes with NA in the CDR3 amino acid column.
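The four filters can be illustrated with a small base R sketch. It is not the package's implementation. It assumes that the CDR3 amino acid column is named CDR3.aa, that a stop codon is marked with "*" and a frame shift with "~", and that "non-coding" means a sequence containing either mark. The sketch_* helpers and the toy data frame are hypothetical names made up for this illustration.

```r
# Toy repertoire. ASSUMPTIONS: the CDR3 amino acid column is named CDR3.aa,
# "*" marks a stop codon and "~" marks a frame shift.
toy <- data.frame(
  CDR3.aa = c("CASSLGQGYEQYF",  # coding, in-frame
              "CASS*GQGYEQYF",  # stop codon only
              "CASSL~GYEQYF",   # frame shift only
              "CAS*L~GYEQYF",   # stop codon and frame shift
              NA),              # missing CDR3, dropped by every filter
  Clones = c(10, 5, 3, 2, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop rows with NA in the CDR3 amino acid column.
drop_na_cdr3 <- function(df) df[!is.na(df$CDR3.aa), , drop = FALSE]

# Hypothetical stand-ins for coding(), noncoding(), inframes(), outofframes().
sketch_coding <- function(df) {
  df <- drop_na_cdr3(df)
  df[!grepl("[*~]", df$CDR3.aa), , drop = FALSE]
}
sketch_noncoding <- function(df) {
  df <- drop_na_cdr3(df)
  df[grepl("[*~]", df$CDR3.aa), , drop = FALSE]
}
sketch_inframes <- function(df) {
  df <- drop_na_cdr3(df)
  df[!grepl("~", df$CDR3.aa, fixed = TRUE), , drop = FALSE]
}
sketch_outofframes <- function(df) {
  df <- drop_na_cdr3(df)
  df[grepl("~", df$CDR3.aa, fixed = TRUE), , drop = FALSE]
}

# A single data frame yields a filtered data frame.
coding_only <- sketch_coding(toy)

# A list of repertoires can be handled by applying the filter to each element.
filtered_list <- lapply(list(A = toy, B = toy), sketch_coding)
```

Under these assumptions the in-frame and out-of-frame filters look only at the frame-shift mark, so a sequence with a stop codon but no frame shift counts as in-frame while still being non-coding.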

Usage

coding(.data)

noncoding(.data)

inframes(.data)

outofframes(.data)

Arguments

.data

The data to be processed. Can be a data.frame, a data.table, or a list of these objects.

Every object must have columns in the immunarch-compatible format (see immunarch_data_format).

Advanced users may provide other data representations: DBI database connections, Apache Spark DataFrames created with copy_to, or a list of these objects. They are supported with the same limitations as basic objects.

Note: each connection must represent a separate repertoire.

Value

Filtered data frame, or a list of filtered data frames when a list is supplied.

Examples

library(immunarch)
data(immdata)
# Filter every repertoire in the list
immdata_cod <- coding(immdata$data)
# Filter a single repertoire
immdata_cod1 <- coding(immdata$data[[1]])

immunomind/immunarch documentation built on March 20, 2024, 12:01 p.m.