phyloseq_filter_prevalence: Filter low-prevalence OTUs.

View source: R/phyloseq_filter.R

phyloseq_filter_prevalence    R Documentation

Description

Removes taxa (OTUs) with low prevalence, where prevalence is the fraction of all samples in which an OTU is observed.
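As an illustration of the prevalence definition above, here is a minimal base-R sketch on a toy count matrix. The matrix `otu` and its values are made up for this example, and the code is not taken from the package source; it only assumes that an OTU counts as "observed" in a sample when its count is greater than zero.

```r
# Toy OTU table (hypothetical counts): rows are OTUs, columns are samples
otu <- matrix(
  c(3, 0, 0, 0,
    1, 4, 0, 2,
    0, 0, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU_A", "OTU_B", "OTU_C"), paste0("S", 1:4))
)

# Prevalence = fraction of samples with a non-zero count for each OTU
prevalence <- rowMeans(otu > 0)
print(prevalence)
```

Here OTU_A is present in one of four samples and OTU_B in three of four, so their prevalences are 0.25 and 0.75; OTU_C is never observed and has prevalence 0.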

Usage

phyloseq_filter_prevalence(
  physeq,
  prev.trh = 0.05,
  abund.trh = NULL,
  threshold_condition = "OR",
  abund.type = "total"
)

Arguments

physeq

A phyloseq-class object

prev.trh

Prevalence threshold (default, 0.05 = 5% of samples)

abund.trh

Abundance threshold (default, NULL)

threshold_condition

Indicates how the prevalence and abundance conditions are combined; can be "OR" (default) or "AND"

abund.type

Character string indicating which type of OTU abundance to take into account for filtering ("total", "mean", or "median")

Details

The abundance threshold (abund.trh) sets the minimum abundance at which an OTU is preserved (e.g., >= 50 reads), measured as specified by abund.type. The threshold_condition parameter controls how this is combined with the prevalence threshold: with "AND", an OTU is kept only if it is both prevalent and abundant; with "OR", it is kept if it meets either condition.
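The interplay of the two thresholds can be sketched in base R on a plain count matrix. This is only an illustration under stated assumptions, not the package's implementation: `filter_sketch` is a hypothetical helper, the toy counts are invented, both thresholds are assumed to be inclusive (>=), and "mean" and "median" are assumed to be taken across all samples, zeros included.

```r
# Toy OTU table (hypothetical counts): rows are OTUs, columns are samples
otu <- matrix(
  c( 5,  0,  0,  0,
     1,  2,  1,  3,
     0,  0,  0, 60,
     0,  1,  0,  0,
    20, 20, 10, 10),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:5), paste0("S", 1:4))
)

# Hypothetical helper (not part of metagMisc) mimicking the documented arguments
filter_sketch <- function(otu, prev.trh = 0.05, abund.trh = NULL,
                          threshold_condition = "OR", abund.type = "total") {
  # Prevalence condition: fraction of samples with a non-zero count
  prev_ok <- rowMeans(otu > 0) >= prev.trh
  if (is.null(abund.trh)) {
    return(otu[prev_ok, , drop = FALSE])
  }
  # Abundance summary per OTU (assumed to include zero counts)
  abund <- switch(abund.type,
    total  = rowSums(otu),
    mean   = rowMeans(otu),
    median = apply(otu, 1, median),
    stop("Unknown abund.type")
  )
  abund_ok <- abund >= abund.trh
  # Combine the two conditions
  keep <- if (threshold_condition == "AND") prev_ok & abund_ok else prev_ok | abund_ok
  otu[keep, , drop = FALSE]
}

# Prevalence only
rownames(filter_sketch(otu, prev.trh = 0.5))
# Prevalent OR abundant
rownames(filter_sketch(otu, prev.trh = 0.5, abund.trh = 50, threshold_condition = "OR"))
# Prevalent AND abundant
rownames(filter_sketch(otu, prev.trh = 0.5, abund.trh = 50, threshold_condition = "AND"))
```

In this toy table, OTU3 occurs in a single sample but has 60 reads, so it passes only under "OR"; OTU2 is present in every sample but has few reads, so it is dropped under "AND".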

Value

A phyloseq-class object containing only the taxa that pass the filter.

See Also

phyloseq_prevalence_plot

Examples

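library(phyloseq)   # provides the GlobalPatterns dataset
library(metagMisc)  # provides phyloseq_filter_prevalence

data(GlobalPatterns)
GlobalPatterns  # 19216 taxa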

# OTUs that are found in at least 5% of samples
phyloseq_filter_prevalence(GlobalPatterns, prev.trh = 0.05, abund.trh = NULL)  # 15389 taxa

# The same, but OTUs with a total abundance >= 10 reads are preserved too
phyloseq_filter_prevalence(GlobalPatterns, prev.trh = 0.05, abund.trh = 10, threshold_condition = "OR")  # 15639 taxa

# Include only taxa with more than 10 reads (on average) in at least 10% of samples
phyloseq_filter_prevalence(GlobalPatterns, prev.trh = 0.1, abund.trh = 10, abund.type = "mean", threshold_condition = "AND")  # 4250 taxa


vmikk/metagMisc documentation built on June 20, 2024, 7:20 a.m.