otu_filter: A function to aggregate low prevalence, abundance, or...


View source: R/filter_funs.R

Description

Takes a tidy_micro set and aggregates the raw counts of taxa with low prevalence and/or abundance into a new "Other" taxon. You can also name specific taxa to fold into the "Other" counts. Once the counts are aggregated, taxa relative abundance (RA), centered log ratio (CLR) transformations, and presence are recalculated. This recalculation only changes the "Other" category.

Usage

otu_filter(
  micro_set,
  prev_cutoff = 0,
  ra_cutoff = 0,
  exclude_taxa = NULL,
  filter_summary = T
)

Arguments

micro_set

A tidy_micro data set

prev_cutoff

Minimum percent of subjects with OTU counts above 0

ra_cutoff

At least one subject must have RA above this cutoff (in percent)

exclude_taxa

A character vector of OTU names that you would like to filter into your "Other" category

filter_summary

Logical; print out summaries of filtering steps
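
To make the two cutoffs concrete, here is a minimal base R sketch on a made-up count matrix. It is not the package's implementation: the object names (counts, prevalence, max_ra, low, filtered) are hypothetical, and the handling of ties is an assumption based on the wording of the printed summaries ("at least 5% of libraries" and "RA > 1%").

```r
# Toy counts: rows are subjects (libraries), columns are taxa. All values are made up.
counts <- matrix(
  c(56, 0, 40, 4,
    60, 0, 39, 1,
    70, 1, 29, 0,
    80, 0, 20, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), c("TaxA", "TaxB", "TaxC", "TaxD"))
)

prev_cutoff <- 50  # percent of subjects with a count above 0
ra_cutoff   <- 5   # percent relative abundance

# Prevalence: percent of subjects in which each taxon has a count above 0
prevalence <- 100 * colMeans(counts > 0)

# Relative abundance (percent) within each subject, then the maximum per taxon
ra     <- 100 * counts / rowSums(counts)
max_ra <- apply(ra, 2, max)

# Assumption: a taxon is kept if prevalence >= cutoff AND some subject has RA > cutoff
low <- prevalence < prev_cutoff | max_ra <= ra_cutoff

# Pool the counts of the failing taxa into a single "Other" column
filtered <- cbind(counts[, !low, drop = FALSE],
                  Other = rowSums(counts[, low, drop = FALSE]))
filtered
```

In this toy example TaxB fails the prevalence cutoff and TaxD fails the RA cutoff, so both end up in "Other" while each subject's total count is unchanged.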

Details

1/Total will be added to each taxon count for CLR transformations in order to avoid issues with log(0)
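
The effect of this pseudo-count can be sketched in a few lines of base R. This is an illustration under assumptions, not the package's code: "Total" is taken here to mean the subject's total count (sequencing depth), and the variable names are hypothetical.

```r
# One subject's counts across taxa (made-up values)
counts <- c(TaxA = 60, TaxB = 0, TaxC = 40)
total  <- sum(counts)            # assumed meaning of "Total"

# Adding 1/Total keeps log() finite for taxa with a zero count
shifted     <- counts + 1 / total
log_shifted <- log(shifted)

# CLR: log of each part relative to the geometric mean of all parts
clr <- log_shifted - mean(log_shifted)
clr
```

Without the shift, log(0) is -Inf for TaxB and the geometric mean is undefined; with it, every CLR value is finite and the values sum to zero within the subject.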

Value

Returns a tidy_micro set

Author(s)

Charlie Carpenter and Dan Frank

Examples

library(tidyMicro) ## Provides tidy_micro(), otu_filter(), and the bpd_* example data

data(bpd_phy); data(bpd_cla); data(bpd_ord); data(bpd_fam); data(bpd_clin)

otu_tabs <- list(Phylum = bpd_phy, Class = bpd_cla,
                 Order = bpd_ord, Family = bpd_fam)

set <- tidy_micro(otu_tabs = otu_tabs, clinical = bpd_clin) %>%
  filter(day == 7) ## Only including the first week

filter_set <- set %>%
  otu_filter(prev_cutoff = 5, ## At least 5% of subjects must have this taxon, or it is filtered
             ra_cutoff = 1,   ## At least 1 subject must have RA above 1%, or it is filtered
             exclude_taxa = c("Unclassified", "Bacteria") ## Unclassified taxa we don't want
  )

Example output

Loading required package: tidyverse
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ ggplot2 3.3.2     ✔ purrr   0.3.4
✔ tibble  3.0.4     ✔ dplyr   1.0.2
✔ tidyr   1.1.2     ✔ stringr 1.4.0
✔ readr   1.4.0     ✔ forcats 0.5.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
Contains 74 libraries from OTU files.

Summary of sequencing depth:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   8851   24938   33314   36650   43590   97408 
Filter for Class counts


Found 'Unclassified' category in input data.

Created new 'Other' category.

Found 'Bacteria' category in input data.

Found 34 OTUs.

Collapsed 2 OTUs into 'Other' in OTU table.

Converted 61662 counts to 'Other' otu category.

Remaining OTUs: 33 (Including 'Other').


Prevalence cutoff: 5% (i.e., at least 5% of libaries must be represented to keep OTU)

Found 33 OTUs.

Found 'Other' category in input data.

Collapsed 13 OTUs into 'Other' in OTU table.

Converted 1 counts to 'Other' in otu category.

Remaining OTUs: 20 (Including 'Other').


Relative abundance cutoff: 1% (i.e., at least one library must have RA > 1% to keep OTU).

Found 20 OTUs.

Found 'Other' category in input data.

Collapsed 15 OTUs into 'Other' in OTU table.

Converted 2879 counts to 'Other' otu category.

Remaining OTUs: 5 (Including 'Other').


Filter for Family counts


Found 'Unclassified' category in input data.

Created new 'Other' category.

Found 'Bacteria' category in input data.

Found 116 OTUs.

Collapsed 2 OTUs into 'Other' in OTU table.

Converted 61662 counts to 'Other' otu category.

Remaining OTUs: 115 (Including 'Other').


Prevalence cutoff: 5% (i.e., at least 5% of libaries must be represented to keep OTU)

Found 115 OTUs.

Found 'Other' category in input data.

Collapsed 59 OTUs into 'Other' in OTU table.

Converted 24 counts to 'Other' in otu category.

Remaining OTUs: 56 (Including 'Other').


Relative abundance cutoff: 1% (i.e., at least one library must have RA > 1% to keep OTU).

Found 56 OTUs.

Found 'Other' category in input data.

Collapsed 47 OTUs into 'Other' in OTU table.

Converted 3845 counts to 'Other' otu category.

Remaining OTUs: 9 (Including 'Other').


Filter for Order counts


Found 'Unclassified' category in input data.

Created new 'Other' category.

Found 'Bacteria' category in input data.

Found 62 OTUs.

Collapsed 2 OTUs into 'Other' in OTU table.

Converted 61662 counts to 'Other' otu category.

Remaining OTUs: 61 (Including 'Other').


Prevalence cutoff: 5% (i.e., at least 5% of libaries must be represented to keep OTU)

Found 61 OTUs.

Found 'Other' category in input data.

Collapsed 25 OTUs into 'Other' in OTU table.

Converted 10 counts to 'Other' in otu category.

Remaining OTUs: 36 (Including 'Other').


Relative abundance cutoff: 1% (i.e., at least one library must have RA > 1% to keep OTU).

Found 36 OTUs.

Found 'Other' category in input data.

Collapsed 29 OTUs into 'Other' in OTU table.

Converted 3398 counts to 'Other' otu category.

Remaining OTUs: 7 (Including 'Other').


Filter for Phylum counts


Found 'Unclassified' category in input data.

Created new 'Other' category.

Found 'Bacteria' category in input data.

Found 15 OTUs.

Collapsed 2 OTUs into 'Other' in OTU table.

Converted 61662 counts to 'Other' otu category.

Remaining OTUs: 14 (Including 'Other').


Prevalence cutoff: 5% (i.e., at least 5% of libaries must be represented to keep OTU)

Found 14 OTUs.

Found 'Other' category in input data.

Collapsed 3 OTUs into 'Other' in OTU table.

Converted 0 counts to 'Other' in otu category.

Remaining OTUs: 11 (Including 'Other').


Relative abundance cutoff: 1% (i.e., at least one library must have RA > 1% to keep OTU).

Found 11 OTUs.

Found 'Other' category in input data.

Collapsed 6 OTUs into 'Other' in OTU table.

Converted 1787 counts to 'Other' otu category.

Remaining OTUs: 5 (Including 'Other').
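
The OTU counts printed in each summary can be checked by hand. The sketch below uses the numbers from the "Filter for Class counts" summary above; the interpretation (the two excluded taxa are merged into one newly created 'Other', and later steps collapse into that existing 'Other') is one reading consistent with the printed numbers, not a statement of the package internals. Variable names are hypothetical.

```r
# Numbers copied from the "Filter for Class counts" summary
found_start <- 34

# exclude_taxa step: 2 taxa removed, 1 new 'Other' category created
after_exclude <- found_start - 2 + 1

# Prevalence step: 13 taxa collapsed into the existing 'Other'
after_prev <- after_exclude - 13

# Relative abundance step: 15 taxa collapsed into the existing 'Other'
after_ra <- after_prev - 15

c(after_exclude, after_prev, after_ra)  # 33 20 5
```

The same arithmetic reproduces the "Remaining OTUs" lines for the Family, Order, and Phylum summaries.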

tidyMicro documentation built on Jan. 13, 2021, 6:18 a.m.