A function that takes any number of OTU tables (or other sequencing data tables), calculates taxa prevalence, relative abundance, and a centered log-ratio (CLR) transformation, and finally merges in the clinical data.
tidy_micro(
otu_tabs,
clinical,
tab_names,
prev_cutoff = 0,
ra_cutoff = 0,
exclude_taxa = NULL,
library_name = "Lib",
complete_clinical = TRUE,
filter_summary = TRUE,
count_summary = TRUE
)
otu_tabs | A single table or a list of tables of metagenomic sequencing data. Each table should have OTU names in its first column and OTU counts in the following columns. Column names should be the sequencing library names.
clinical | Sequencing-level clinical data. Must have a column of unique library names (sequencing IDs).
tab_names | Names for otu_tabs. These will become the "Tables" column. Alternatively, name the OTU tables in the list supplied to otu_tabs.
prev_cutoff | A prevalence cutoff: *X* percent of libraries must contain a taxon, or it will be included in the "Other" category.
ra_cutoff | A relative abundance (RA) cutoff: at least one library must have an RA above the cutoff, or the taxon will be included in the "Other" category.
exclude_taxa | A character vector specifying any taxa that you would like to include in the "Other" category. The taxa specified will be included in "Other" for every OTU table provided.
library_name | The name of the column in clinical containing the sequencing library names. These should match the column names of the supplied OTU tables (after the first column).
complete_clinical | Logical; only include columns from the OTU tables whose library name is in the clinical data.
filter_summary | Logical; print out summaries of filtering steps. Ignored
count_summary | Logical; print out a summary of unique library names and sequencing depth.
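The two cutoffs can be pictured with a small base R sketch. This is only an illustration of the idea described for prev_cutoff and ra_cutoff, not the package's internal code: the toy counts, the taxon names, and the assumption that both cutoffs are on a percent scale are all hypothetical.

```r
# Hypothetical toy counts: rows are taxa, columns are libraries
counts <- matrix(c(50, 40, 60,
                    0,  1,  0,
                   50, 59, 40),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("TaxonA", "TaxonB", "TaxonC"),
                                 c("Lib1", "Lib2", "Lib3")))

prev_cutoff <- 50  # assumption: percent of libraries
ra_cutoff   <- 5   # assumption: relative abundance in percent

# Percent of libraries in which each taxon is present
prevalence <- 100 * rowMeans(counts > 0)

# Relative abundance (percent) of each taxon within each library
ra <- 100 * sweep(counts, 2, colSums(counts), "/")

# Keep a taxon only if it passes both cutoffs
keep <- prevalence >= prev_cutoff & apply(ra, 1, max) > ra_cutoff

# Collapse every taxon that fails a cutoff into a single "Other" row
filtered <- rbind(counts[keep, , drop = FALSE],
                  Other = colSums(counts[!keep, , drop = FALSE]))
```

Because failing taxa are pooled rather than dropped, the column totals (sequencing depths) of the toy table are unchanged by the filtering.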
Column names of the OTU tables must be the same for each table, and these should be the library names in your clinical data. Please see the vignette for a detailed description.
The CLR transformation adds (1 / sequencing depth) to each OTU count for each library before centering and log transforming in order to avoid issues with 0 counts.
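As a rough illustration of the CLR step described above, the following base R sketch applies the stated pseudo-count to a toy count matrix. It follows the description literally (add 1 / sequencing depth to each count, take logs, center within each library); the function name clr_sketch and the toy data are hypothetical, and the package's own implementation may differ in detail.

```r
# Hypothetical toy counts: rows are taxa, columns are libraries
counts <- matrix(c(10, 0, 5,
                    3, 7, 0),
                 nrow = 3,
                 dimnames = list(c("TaxonA", "TaxonB", "TaxonC"),
                                 c("Lib1", "Lib2")))

# Illustrative helper (not part of the package): CLR for one library
clr_sketch <- function(x) {
  depth <- sum(x)                  # sequencing depth of this library
  shifted <- x + 1 / depth         # pseudo-count so log(0) never occurs
  log_shifted <- log(shifted)
  log_shifted - mean(log_shifted)  # center on the mean log value
}

# Apply the transformation library by library (column by column)
clr_vals <- apply(counts, 2, clr_sketch)
```

Each column of clr_vals sums to zero (up to floating-point error), which is the defining property of a centered log-ratio, and the zero counts yield finite values because of the pseudo-count.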
The list of OTU tables is split, manipulated, and stacked into a data frame using the ldply function from the plyr package. The names of the OTU tables supplied will be the name of their "Table" in the final tidy_micro set.
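The stacking idea can be sketched without plyr. The snippet below is a base R analogue using lapply and rbind rather than ldply itself, and the toy tables and the OTU_Name column are hypothetical. It only shows how the list names end up as a table label in the stacked data frame.

```r
# Hypothetical toy OTU tables sharing the same library columns
phylum <- data.frame(OTU_Name = c("Firmicutes", "Proteobacteria"),
                     Lib1 = c(10, 5), Lib2 = c(3, 7))
class_tab <- data.frame(OTU_Name = c("Bacilli", "Gammaproteobacteria"),
                        Lib1 = c(8, 4), Lib2 = c(2, 6))

otu_list <- list(Phylum = phylum, Class = class_tab)

# Label each table with its list name, then stack the pieces row-wise
labelled <- lapply(names(otu_list), function(nm) {
  cbind(Table = nm, otu_list[[nm]], stringsAsFactors = FALSE)
})
stacked <- do.call(rbind, labelled)
```

Stacking by row only works because every table has identical column names, which is why the library names must match across the supplied OTU tables.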
A data.frame in the tidy_micro format
Charlie Carpenter
data(bpd_phy); data(bpd_cla); data(bpd_ord); data(bpd_fam); data(bpd_clin)
## Multiple OTU tables with named list
otu_tabs <- list(Phylum = bpd_phy, Class = bpd_cla,
                 Order = bpd_ord, Family = bpd_fam)
set <- tidy_micro(otu_tabs = otu_tabs, clinical = bpd_clin)
## Multiple OTU tables with unnamed list
unnamed_tabs <- list(bpd_phy, bpd_cla, bpd_ord, bpd_fam)
set <- tidy_micro(otu_tabs = unnamed_tabs,
tab_names = c("Phylum", "Class", "Order", "Family"), clinical = bpd_clin)
## Single OTU table
set <- tidy_micro(otu_tabs = bpd_cla, clinical = bpd_clin, tab_names = "Class")
## Filtering out low abundance or uninteresting taxa right away
## WARNING: Only do this if you do not want to calculate alpha diversities with this tidy_micro set
filter_set <- tidy_micro(otu_tabs = otu_tabs, clinical = bpd_clin,
prev_cutoff = 5, ## 5% of libraries must have this bug, or it is filtered
ra_cutoff = 1, ## At least 1 library must have an RA above 1, or it is filtered
exclude_taxa = c("Unclassified", "Bacteria") ## Unclassified taxa we don't want
)