View source: R/filter_feature.R
| filter_feature | R Documentation |
Filter out features (ASVs) based on a read count cutoff threshold and ASV prevalence across samples.
filter_feature(
count_df,
tax_df,
filter_method = "abs_count",
asv_cutoff = 1,
prev_cutoff = 2
)
count_df |
dataframe. Count table with samples in columns and ASVs in rows. Feature IDs are stored as rownames. |
tax_df |
dataframe. The featureID column must match the rownames of count_df. |
filter_method |
filtering method: 'abs_count', 'percent_sample' or 'percent_dataset'. Default is 'abs_count'. |
asv_cutoff |
cutoff used to filter sequences. Features are kept when they are greater than this cutoff. |
prev_cutoff |
prevalence cutoff. ASVs must reach the asv_cutoff in at least this many samples. |
Filtering is performed based on read count and sample prevalence.
ASVs are kept if they pass the ASV count cut-off OR if they pass
the sample prevalence cut-off.
When filter_method is set to 'abs_count', a read count is used as the
threshold cutoff. The recommended default for asv_cutoff is 1.
When filter_method is set to 'percent_sample', a percent of the
sample total read count is used as the threshold cutoff.
ASVs must therefore reach a certain percentage of a given sample.
The recommended default for asv_cutoff is 0.01, meaning 0.01% of each sample.
When filter_method is set to 'percent_dataset', a percent
of the dataset total read count is used as the threshold cutoff.
The recommended default for asv_cutoff is 0.01, meaning 0.01% of the entire dataset.
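The three threshold types described above can be illustrated with a minimal base R sketch. This is not the code used inside filter_feature. The toy count table is hypothetical, and the per-cell reading of 'percent_dataset' (each count divided by the total reads in the whole table) is an assumption.

```r
# Hypothetical toy count table: ASVs in rows, samples in columns
count_df <- data.frame(
  s1 = c(0, 50, 99950),
  s2 = c(1, 0, 99999),
  s3 = c(0, 0, 100000),
  row.names = c("ASV1", "ASV2", "ASV3")
)
counts <- as.matrix(count_df)

# 'abs_count': compare raw read counts with the cutoff (here 1)
pass_abs <- counts > 1

# 'percent_sample': each count as a percent of its sample (column) total
pct_sample <- sweep(counts, 2, colSums(counts), "/") * 100
pass_sample <- pct_sample > 0.01

# 'percent_dataset' (assumed reading): each count as a percent of all reads
pct_dataset <- counts / sum(counts) * 100
pass_dataset <- pct_dataset > 0.01

# Number of samples in which each ASV exceeds the cutoff, per method
rowSums(pass_abs)
rowSums(pass_sample)
rowSums(pass_dataset)
```

In this toy table ASV1 never exceeds any of the three cutoffs, while ASV2 exceeds each of them in sample s1 only.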
prev_cutoff has a minimum value of 1 (a sequence must reach the cutoff in
at least 1 sample, which would not filter out any sequences).
The default value is 2, which is the most relaxed cutoff that still filters on prevalence.
A recommended value is the number of samples that make up 5% of the total number of samples.
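As a rough sketch of the 5% recommendation, prev_cutoff can be derived from the number of columns in the count table. The toy matrix and the helper names n_pass and reaches_prev are hypothetical. Counting the samples in which an ASV exceeds asv_cutoff is one reading of the prevalence check and is not taken from the function's source.

```r
# Hypothetical count matrix: 2 ASVs across 40 samples
n_samples <- 40
counts <- matrix(
  0, nrow = 2, ncol = n_samples,
  dimnames = list(c("ASV_a", "ASV_b"), paste0("s", seq_len(n_samples)))
)
counts["ASV_a", 1:5] <- 10  # present in 5 samples
counts["ASV_b", 1] <- 10    # present in 1 sample

# Number of samples that make up 5% of all samples, never below 1
prev_cutoff <- max(1, ceiling(ncol(counts) * 5 / 100))

# Samples in which each ASV exceeds an absolute count cutoff of 1
n_pass <- rowSums(counts > 1)

# TRUE where the ASV exceeds the cutoff in at least prev_cutoff samples
reaches_prev <- n_pass >= prev_cutoff
```

With 40 samples this gives a prev_cutoff of 2, so ASV_a (5 samples) reaches the prevalence cutoff and ASV_b (1 sample) does not.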
list of:
filtered_table - filtered
Also returns list of:
p_agg - plot of sequences removed/kept based on relative abundance
vs ASV prevalence, in aggregated view (mean ASV relative abundance)
p_exp - expanded view (ASV relative abundance for every sample shown)
feat_keep - vector of ASVs remaining after filtering
feat_remove - vector of ASVs removed during filtering
library(magrittr) # provides %>%
library(tibble)   # provides column_to_rownames()
data(dss_example)
# put featureID as rownames
tax_df <- dss_example$merged_taxonomy
count_df <- dss_example$merged_abundance_id %>%
column_to_rownames('featureID')
# order the count table rows to match the featureID order in the taxonomy table
count_df <- count_df[tax_df$featureID,]
filtered_ls <- filter_feature(count_df, tax_df, 'percent_sample', 0.001, 2)
summary(filtered_ls)
filtered_count <- filtered_ls$filtered_table
dim(filtered_count)
head(filtered_count)