| filterInputData | R Documentation | 
Given a data frame with a column containing receptor sequences, filter data rows by sequence length and sequence content. Keep all data columns or choose which columns to keep.
filterInputData(
  data,
  seq_col,
  min_seq_length = NULL,
  drop_matches = NULL,
  subset_cols = NULL,
  count_col = NULL,
  verbose = FALSE
)
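| data | A data frame. |
| seq_col | Specifies the column(s) of data containing the receptor sequences. |
| min_seq_length | Observations whose receptor sequences have fewer than min_seq_length characters are removed. |
| drop_matches | Accepts a character string containing a regular expression (see regex). Observations whose receptor sequences match the pattern are removed. |
| subset_cols | Specifies which columns of the AIRR-Seq data are included in the output. Accepts a character vector of column names or a numeric vector of column indices. The default NULL includes all columns. |
| count_col | Optional. Specifies the column of data containing the clone counts. |
| verbose | Logical. If TRUE, messages about the tasks performed are printed. |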
A data frame.
Brian Neal (Brian.Neal@ucsf.edu)
Hai Yang, Jason Cham, Brian Neal, Zenghua Fan, Tao He and Li Zhang. (2023). NAIR: Network Analysis of Immune Repertoire. Frontiers in Immunology, vol. 14. doi: 10.3389/fimmu.2023.1181825
set.seed(42)
raw_data <- simulateToyData()
# Remove sequences shorter than 13 characters,
# as well as sequences containing the subsequence "GGGG".
# Keep variables for clone sequence, clone frequency and sample ID
filterInputData(
  raw_data,
  seq_col = "CloneSeq",
  min_seq_length = 13,
  drop_matches = "GGGG",
  subset_cols =
    c("CloneSeq", "CloneFrequency", "SampleID"),
  verbose = TRUE
)
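The filtering rules described above can be illustrated with a rough base R sketch. This is not the NAIR implementation: filter_sketch is a hypothetical helper, and the toy data frame (including its Extra column) is made up for illustration. The sketch assumes that rows are kept when the sequence has at least min_seq_length characters and does not match the drop_matches pattern, and that NULL for subset_cols keeps all columns.

```r
# Made-up toy data; column names mirror the example above.
toy <- data.frame(
  CloneSeq = c(strrep("A", 14), "TTTT", "ACGGGGTACGTACGT", "ACGTACGTACGTAC"),
  CloneFrequency = c(0.4, 0.3, 0.2, 0.1),
  SampleID = c("S1", "S1", "S2", "S2"),
  Extra = 1:4,
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of NAIR) mimicking the documented behavior.
filter_sketch <- function(data, seq_col, min_seq_length = NULL,
                          drop_matches = NULL, subset_cols = NULL) {
  seqs <- as.character(data[[seq_col]])
  keep <- rep(TRUE, nrow(data))
  # Drop rows whose sequence has fewer than min_seq_length characters.
  if (!is.null(min_seq_length)) {
    keep <- keep & nchar(seqs) >= min_seq_length
  }
  # Drop rows whose sequence matches the regular expression.
  if (!is.null(drop_matches)) {
    keep <- keep & !grepl(drop_matches, seqs)
  }
  # NULL means "keep every column" (assumed default).
  if (is.null(subset_cols)) {
    subset_cols <- names(data)
  }
  data[keep, subset_cols, drop = FALSE]
}

filtered <- filter_sketch(
  toy,
  seq_col = "CloneSeq",
  min_seq_length = 13,
  drop_matches = "GGGG",
  subset_cols = c("CloneSeq", "CloneFrequency", "SampleID")
)
print(filtered)
```

In this toy input, the second row is too short and the third contains "GGGG", so only the first and fourth rows remain, with the Extra column dropped.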