View source: R/filter_feature.R
filter_feature | R Documentation |
Filter out reads based on a cutoff threshold and ASV prevalence across samples
filter_feature(
count_df,
tax_df,
filter_method = "abs_count",
asv_cutoff = 1,
prev_cutoff = 2
)
count_df |
Data frame. Count table with samples in columns and ASVs in rows. Feature IDs in rownames. |
tax_df |
Data frame. featureID must match the rownames of count_df. |
filter_method |
default |
asv_cutoff |
Cutoff used to filter sequences. Features are kept when they are greater than this cutoff. |
prev_cutoff |
prevalence cutoff. ASVs must reach the |
Filtering is performed based on read count and sample prevalence.
ASVs are kept if they pass the ASV count cut-off OR if they pass
the sample prevalence cut-off.
filter_method = 'abs_count'
uses a read count as the threshold cutoff.
Recommended default of 1 for asv_cutoff.
When filter_method
is set to 'percent_sample',
a percentage of each
sample's total read count is used as the threshold cutoff.
ASVs must therefore reach a certain percentage of a given sample.
Recommended default of 0.01
for 0.01% of each sample.
When filter_method
is set to 'percent_dataset',
a percentage
of the dataset's total read count is used as the threshold cutoff.
Recommended default of 0.01
for 0.01% of the entire dataset.
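The three threshold methods described above can be illustrated with a small base R sketch. This is not the package's implementation: the helper passes_cutoff() and the toy count matrix are hypothetical. It assumes that every count is compared against the threshold with a strict greater-than test (as stated for asv_cutoff), and that 'percent_dataset' compares each count against a percentage of the grand total.

```r
# Toy count table: ASVs in rows, samples in columns (made-up values)
counts <- matrix(
  c(500, 300,   0,
      2,   0,   1,
     98, 200, 400),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2", "S3"))
)

# Hypothetical helper: returns a logical matrix that is TRUE wherever
# a count is strictly greater than the threshold for the chosen method
passes_cutoff <- function(counts, filter_method, asv_cutoff) {
  threshold <- switch(
    filter_method,
    # same absolute read count for every cell
    abs_count = matrix(asv_cutoff, nrow(counts), ncol(counts)),
    # one threshold per sample: asv_cutoff percent of that sample's total
    percent_sample = matrix(colSums(counts) * asv_cutoff / 100,
                            nrow(counts), ncol(counts), byrow = TRUE),
    # one threshold overall: asv_cutoff percent of the dataset total
    percent_dataset = matrix(sum(counts) * asv_cutoff / 100,
                             nrow(counts), ncol(counts)),
    stop("unknown filter_method: ", filter_method)
  )
  counts > threshold
}

passes_cutoff(counts, "abs_count", 1)
passes_cutoff(counts, "percent_sample", 0.5)
passes_cutoff(counts, "percent_dataset", 0.1)
```

In this toy table, ASV2 exceeds an absolute cutoff of 1 only in sample S1, and it never exceeds 0.5% of any sample's total.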
prev_cutoff
has a minimum value of 1 (a sequence must reach the cutoff in
at least 1 sample, which would not filter out any sequences).
The default value is 2, which is the most relaxed cutoff that still filters.
A recommended value is the number of samples that make up 5% of the total number of samples.
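As a rough illustration of the prevalence step, the sketch below counts the samples in which each ASV exceeds the count cutoff and compares that count with prev_cutoff. It follows the description of prev_cutoff (a sequence must reach the cutoff in at least prev_cutoff samples) and is not taken from the package source. The toy matrix, the local variable names and the choice to round the 5% value up are assumptions.

```r
# Toy logical matrix: TRUE where an ASV exceeds the count cutoff (made-up)
pass <- matrix(
  c(TRUE,  TRUE,  FALSE, TRUE,
    TRUE,  FALSE, FALSE, FALSE,
    FALSE, FALSE, FALSE, FALSE),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ASV1", "ASV2", "ASV3"), paste0("S", 1:4))
)

# Prevalence: number of samples in which each ASV passes the cutoff
prevalence <- rowSums(pass)

# Keep ASVs that pass in at least prev_cutoff samples (illustrative names)
prev_cutoff <- 2
feat_keep   <- names(prevalence)[prevalence >= prev_cutoff]
feat_remove <- names(prevalence)[prevalence <  prev_cutoff]

# Recommended value: number of samples making up 5% of all samples,
# rounded up and never below the minimum of 1 (rounding is an assumption)
n_samples <- 40
prev_5pct <- max(1, ceiling(0.05 * n_samples))
```

With the real data, n_samples would be ncol(count_df), since samples are in columns.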
list of:
filtered_table - filtered
p_agg - plot of sequences removed/kept based on relative abundance vs ASV prevalence, aggregated view (mean ASV relative abundance)
p_exp - expanded view (ASV relative abundance for every sample shown)
feat_keep - vector of ASVs remaining after filtering
feat_remove - vector of ASVs removed during filtering
library(tibble)   # column_to_rownames()
library(magrittr) # %>%
data(dss_example)
# put featureID as rownames
tax_df <- dss_example$merged_taxonomy
count_df <- dss_example$merged_abundance_id %>%
column_to_rownames('featureID')
# put the features in count_df in the same order as in tax_df
count_df <- count_df[tax_df$featureID,]
filtered_ls <- filter_feature(count_df, tax_df, filter_method = 'percent_sample',
                              asv_cutoff = 0.001, prev_cutoff = 2)
summary(filtered_ls)
filtered_count <- filtered_ls$filtered
dim(filtered_count)
head(filtered_count)