ps_filter: Filter phyloseq samples by sample_data variables

Description

Keep only samples with sample_data matching one or more conditions. By default this function also removes taxa which never appear in any of the remaining samples, by running tax_filter(min_prevalence = 1). You can prevent this taxa filtering with .keep_all_taxa = TRUE.
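The effect of .keep_all_taxa can be checked by comparing taxa counts before and after filtering. The following is a minimal sketch, assuming the phyloseq and microViz packages are installed and using the enterotype example data from phyloseq; the object names ps_default and ps_all_taxa are illustrative only.

```r
library(phyloseq)
library(microViz)

data("enterotype", package = "phyloseq")

# default behaviour: taxa absent from every remaining sample are dropped
ps_default <- ps_filter(enterotype, SeqTech == "Sanger")

# .keep_all_taxa = TRUE: samples are filtered, the taxa are left untouched
ps_all_taxa <- ps_filter(enterotype, SeqTech == "Sanger", .keep_all_taxa = TRUE)

# both results keep the same samples
nsamples(ps_default) == nsamples(ps_all_taxa)

# taxa are only removed in the default case
ntaxa(ps_all_taxa) == ntaxa(enterotype)
ntaxa(ps_default) <= ntaxa(ps_all_taxa)
```

Keeping all taxa is mainly useful when the result must stay aligned with another object that has the same taxa, or when the abundance values are transformed and zero no longer means "absent".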

Usage

ps_filter(ps, ..., .target = "sample_data", .keep_all_taxa = FALSE)

Arguments

ps

phyloseq object

...

filtering conditions, passed directly to dplyr::filter() (see examples and ?dplyr::filter)

.target

which slot of the phyloseq object to filter by; currently only "sample_data" is supported

.keep_all_taxa

if FALSE (the default), remove taxa which are no longer present in the dataset after filtering

Details

Use ps_filter as you would use dplyr::filter(), but with a phyloseq object!
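To illustrate the analogy, the sketch below applies the same conditions once with ps_filter on a phyloseq object and once with dplyr::filter() on the sample data extracted as a plain data frame. It assumes the phyloseq, dplyr and microViz packages are installed; the helper column sample_name and the object names are introduced here only for the comparison.

```r
library(phyloseq)
library(dplyr)
library(microViz)

data("enterotype", package = "phyloseq")

# comma-separated conditions are combined with AND, as in dplyr::filter()
ps_sub <- ps_filter(enterotype, SeqTech == "Illumina", !is.na(Enterotype))

# the same conditions applied to the sample data as an ordinary data frame
df <- data.frame(sample_data(enterotype))
df$sample_name <- sample_names(enterotype) # hypothetical helper column
df_sub <- dplyr::filter(df, SeqTech == "Illumina", !is.na(Enterotype))

# both approaches should select the same set of samples
setequal(sample_names(ps_sub), df_sub$sample_name)
```

The difference is that ps_filter returns a complete phyloseq object, so the otu_table and other slots stay in sync with the filtered sample_data.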

Value

phyloseq object (with filtered sample_data)

See Also

dplyr::filter explains in more detail how to give arguments to this function

tax_filter for filtering taxa (not samples)

Examples

library(phyloseq)
library(dplyr)
library(microViz) # provides ps_filter, tax_filter, tax_transform, ps_get

data("enterotype", package = "phyloseq")
enterotype
sample_data(enterotype)[1:10, 1:5]

# keep only samples with SeqTech not equal to "Sanger"
ps1 <- ps_filter(enterotype, SeqTech != "Sanger")
ps1
sample_data(ps1)[1:10, 1:5]

# keep only samples with no NAs in any variables
ps2 <- enterotype %>% ps_filter(!if_any(everything(), is.na))
ps2
sample_data(ps2)[1:8, 1:8]

# ps2 is equivalent to dropping samples with incomplete sample_variables and tax_filtering 0s
ps3 <- enterotype %>%
  ps_drop_incomplete() %>%
  tax_filter(undetected = 0, use_counts = FALSE)
# we needed to set a low detection threshold because this example data is proportions
identical(ps2, ps3) # TRUE

# ps_filter gives a warning if some of the otu_table values are negative
# (which may happen when filtering data that has e.g. clr-transformed taxa abundances),
# because by default it attempts to discard any taxa that become always absent/0 after filtering.
# Set .keep_all_taxa = TRUE to avoid this taxa filtering, which is unwanted in this case
enterotype %>%
  tax_transform("clr") %>%
  ps_get() %>%
  ps_filter(SeqTech == "Sanger", .keep_all_taxa = TRUE)

david-barnett/microViz documentation built on April 17, 2025, 4:25 a.m.