freq_filter | R Documentation |
Filter taxonomic features in data based on frequency
freq_filter(data, min_freq = 0, filter_target = c(), tax_id = c())
data |
A data.table object in long format, where each row represents a sequence feature in a given sequenced sample. |
min_freq |
Numeric value specifying the minimum acceptable frequency for a taxonomic feature within all replicates. |
filter_target |
Column name or group of columns specifying the ID variable for each sample-fraction. Column(s) should be able to uniquely differentiate every fraction from every replicate. |
tax_id |
Column name specifying a unique identifier for each sequence feature. |
There is no strong agreement among qSIP users on the proper threshold for removing rare features. Rare and infrequent taxa produce noise in the data, making it hard to discern quality. The one guiding principle on which there may be agreement is that it is best to start with minimal filters, so as to be as inclusive as possible, and to intensify them only as needed to reduce noise.
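One way to act on this principle is to tabulate how many taxa would survive a range of thresholds before committing to one. The sketch below does not call freq_filter and is not its implementation. It assumes that "frequency" means the number of rows (fractions) in which a taxon occurs within a replicate, and it uses a made-up table with hypothetical columns `rep_id` and `asv_id`.

```r
library(data.table)

# Hypothetical long-format data: one row per taxon per fraction
toy <- data.table(
  rep_id = rep(c("rep1", "rep2"), each = 6),
  asv_id = c("a", "a", "a", "a", "b", "b",
             "a", "a", "a", "b", "c", "c")
)

# Assumed definition of frequency: occurrences of a taxon within a replicate
freqs <- toy[, .(freq = .N), by = .(rep_id, asv_id)]

# For each candidate threshold, count the taxa that meet it
# in at least one replicate
thresholds <- 1:4
taxa_kept <- sapply(thresholds, function(th) {
  freqs[freq >= th, uniqueN(asv_id)]
})

survivors <- data.table(min_freq = thresholds, taxa_kept = taxa_kept)
print(survivors)
```

A sharp drop in `taxa_kept` between two adjacent thresholds marks where filtering starts to remove many taxa at once, which may be a reasonable place to stop and inspect the data.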
One or more columns may be specified in the filter_target parameter, allowing for frequency filtering across treatment groups. My recommendation in this case is to make sure that all non-labeled samples (those where isotopic composition is at natural abundance) are grouped together, so that non-labeled buoyant density estimates can be made from as many occurrences as possible for each taxon.
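The grouping recommended above could be prepared as a single column before filtering. This is only a sketch: the columns `treatment` and `isotope`, the label "16O" for natural abundance, and the new column `filter_group` are all hypothetical and will differ between data sets.

```r
library(data.table)

# Hypothetical sample metadata, as it might appear in the long-format data
toy <- data.table(
  sampleID  = c("s1", "s2", "s3", "s4"),
  treatment = c("control", "drought", "control", "drought"),
  isotope   = c("16O", "16O", "18O", "18O")
)

# Non-labeled samples share one group regardless of treatment;
# labeled samples stay split by treatment
toy[, filter_group := ifelse(
  isotope == "16O",
  "unlabeled",
  paste(treatment, isotope, sep = "_")
)]

print(toy)
```

A column such as `filter_group` could then be supplied in filter_target, together with any other ID columns the data require.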
Returns a filtered data.table containing only the taxa that meet the specified frequency threshold.
seq_summary
data(example_qsip)
# Initial sequence and ASV counts
seq_summary(example_qsip, 'seq_abund', 'asv_id')
# Remove taxa that occur in fewer than 3 fractions in any given replicate
example_qsip <- freq_filter(example_qsip, 3, 'sampleID', 'asv_id')
seq_summary(example_qsip, 'seq_abund', 'asv_id')