knitr::opts_chunk$set(fig.width=8, fig.height=4, cache = TRUE)
library(phylosmith)
data(soil_column)

Most examples in this vignette use the GlobalPatterns dataset from phyloseq; the set_treatment_levels example uses the soil_column dataset loaded from phylosmith.

library(phyloseq)
data(GlobalPatterns)


conglomerate_samples

Merges samples within a phyloseq-class object that match on the given criteria (treatment). Any sample_data factors that do not match are set to NA. The otu_table counts are reassigned as the mean across all the samples that are merged together.

Use this with caution as replicate samples may be crucial to the experimental design and should be proven statistically to be similar enough to combine for downstream analysis.


Usage

conglomerate_samples(phyloseq_obj, treatment, subset = NULL)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name, as a string or a vector of strings, in the sample_data.
subset | A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.

Examples

phyloseq::sample_sums(GlobalPatterns)
conglomerated <- conglomerate_samples(GlobalPatterns, treatment = 'SampleType')
phyloseq::sample_sums(conglomerated)
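
The averaging step described above can be illustrated without phyloseq. The following is a minimal base-R sketch on a made-up count matrix; the object names (`otu`, `treatment`, `merged`) and the counts are hypothetical, and it is not the package's actual implementation.

```r
# Toy OTU table: rows are taxa, columns are samples (counts are made up).
otu <- matrix(
  c(10, 20, 0, 4,
     2,  6, 8, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("OTU_1", "OTU_2"), c("S1", "S2", "S3", "S4"))
)

# Hypothetical treatment label for each sample, in column order.
treatment <- c("soil", "soil", "water", "water")

# For each treatment, average the counts of the samples that share it.
merged <- sapply(
  split(colnames(otu), treatment),
  function(samples) rowMeans(otu[, samples, drop = FALSE])
)

merged  # one column per treatment, one row per taxon
```

Each merged column holds the per-taxon mean of its group, so a taxon with counts 10 and 20 in the two soil samples ends up with 15 in the merged soil sample.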




conglomerate_taxa

A rewrite of phyloseq::tax_glom() that runs faster by using data.table.


Usage

conglomerate_taxa(phyloseq_obj, classification, hierarchical = TRUE)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
classification | Column name as a string in the tax_table for the factor to conglomerate by.
hierarchical | Whether the order of factors in the tax_table represents a decreasing hierarchy (TRUE) or the factors are independent (FALSE). If FALSE, will only return the factor given by classification.

Examples

conglomerate_taxa(GlobalPatterns, classification = 'Phylum', hierarchical = TRUE)
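
As a rough illustration of what conglomerating by a taxonomic level means, the sketch below collapses a made-up count matrix by phylum using base R's `rowsum()`. It assumes that counts of taxa sharing a classification are summed, as `phyloseq::tax_glom()` does; the data and object names are hypothetical, and the package itself works on phyloseq objects with data.table.

```r
# Toy OTU table: rows are taxa, columns are samples (counts are made up).
otu <- matrix(
  c(5, 1,
    3, 0,
    2, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU_1", "OTU_2", "OTU_3"), c("S1", "S2"))
)

# Hypothetical phylum assignment for each row of `otu`.
phylum <- c("Proteobacteria", "Proteobacteria", "Cyanobacteria")

# Sum the counts of all taxa that share a phylum, sample by sample.
by_phylum <- rowsum(otu, group = phylum)

by_phylum  # one row per phylum
```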




melt_phyloseq

Converts the otu_table, tax_table, and sam_data to a 2-dimensional data.table.


Usage

melt_phyloseq(phyloseq_obj)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.

Examples

melt_phyloseq(GlobalPatterns)
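
To show the shape of a melted table, here is a small base-R sketch that builds a long table from made-up counts, taxonomy, and sample metadata. The column names (`OTU`, `Sample`, `Abundance`) and the data are assumptions chosen for illustration; the columns that melt_phyloseq actually returns may be named differently.

```r
# Toy inputs standing in for otu_table, tax_table, and sample_data.
otu <- matrix(
  c(5, 1,
    3, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(OTU = c("OTU_1", "OTU_2"), Sample = c("S1", "S2"))
)
tax <- data.frame(
  OTU = c("OTU_1", "OTU_2"),
  Phylum = c("Proteobacteria", "Cyanobacteria")
)
sam <- data.frame(
  Sample = c("S1", "S2"),
  SampleType = c("soil", "water")
)

# One row per taxon-sample pair, with the count in `Abundance`.
long <- as.data.frame(as.table(otu), responseName = "Abundance",
                      stringsAsFactors = FALSE)

# Attach the taxonomy and the sample metadata to every row.
long <- merge(merge(long, tax, by = "OTU"), sam, by = "Sample")

long
```

With 2 taxa and 2 samples the result has 4 rows, each carrying the count together with its taxonomy and sample metadata.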




merge_treatments

Combines multiple columns from the sample-data into a single column. Doing this can make it easier to subset and look at the data on multiple factors.


Usage

merge_treatments(phyloseq_obj, ...)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object. It must contain sample_data() with information about each sample.
treatment | Column name, as a string or a vector of strings, in the sample_data.

Examples

merge_treatments(GlobalPatterns, c('Final_Barcode', 'Barcode_truncated_plus_T'))
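
The idea of combining columns can be sketched on a plain data frame. In this hypothetical example the separator (`"_"`), the new column name, and the data are all assumptions; the name and format that merge_treatments gives the combined column may differ.

```r
# Toy sample metadata with two factors of interest.
sam <- data.frame(
  SampleType = c("soil", "soil", "water"),
  Day = c(0, 7, 7)
)

# Combine the two columns into one label per sample.
sam$SampleType_Day <- paste(sam$SampleType, sam$Day, sep = "_")

# A single comparison now selects on both factors at once.
soil_day7 <- sam[sam$SampleType_Day == "soil_7", ]

sam
```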




set_sample_order

Arranges the phyloseq object so that the samples are listed in a given order, or sorted on metadata. This is most useful for visual inspection of the metadata, and for having the samples presented in the correct order in ggplot2 figures.

Usage

set_sample_order(phyloseq_obj, treatment)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name, as a string or a vector of strings, in the sample_data.

Examples

phyloseq::sample_names(GlobalPatterns)
ordered_obj <- set_sample_order(GlobalPatterns, "SampleType")
phyloseq::sample_names(ordered_obj)
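
Sorting samples on a metadata column amounts to reordering the sample names by that column. A minimal base-R sketch with made-up metadata follows; the object names are hypothetical and this is not the package's code.

```r
# Toy sample metadata; row names are the sample names.
sam <- data.frame(
  SampleType = c("water", "soil", "feces"),
  row.names = c("S1", "S2", "S3")
)

# Sample names rearranged so that SampleType is in sorted order.
sample_order <- rownames(sam)[order(sam$SampleType)]

# Apply the same order to the metadata (and likewise to the OTU table columns).
sam_sorted <- sam[sample_order, , drop = FALSE]

sample_order
```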



set_treatment_levels

Sets the order of the levels of a factor in the sample-data. Primarily useful for controlling the order in which ggplot2 displays samples.

Usage

set_treatment_levels(phyloseq_obj, treatment, order)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name, as a string or a vector of strings, in the sample_data.
order | The order of the levels in the treatment column, as a vector of strings. If set to "numeric", the levels are put in ascending numerical order.

Examples

levels(soil_column@sam_data$Day)
ordered_days <- set_treatment_levels(soil_column, 'Day', 'numeric')
levels(ordered_days@sam_data$Day)
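
The reason a "numeric" ordering is useful is that R sorts factor levels as strings by default, which puts "10" before "2". The base-R sketch below shows the problem and one way to fix it on a made-up factor; it illustrates the idea only and is not the package's implementation.

```r
# Days stored as text, as they often are in sample metadata.
day <- factor(c("10", "2", "0", "2"))

# Default levels are sorted as strings, so "10" lands before "2".
levels(day)

# Sort the distinct values as numbers, then rebuild the factor with those levels.
numeric_levels <- sort(unique(as.numeric(as.character(day))))
day_ordered <- factor(day, levels = as.character(numeric_levels))

levels(day_ordered)  # "0" "2" "10"
```

Because ggplot2 follows the level order of a factor, the reordered factor plots day 2 before day 10.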



taxa_extract

Creates a new phyloseq object containing only the specified taxa. Each taxon name can be given as a substring or as the entire name. The string is matched across all taxonomic levels unless a specific classification level is declared.

Usage

taxa_extract(phyloseq_obj, taxa_to_extract, classification = NULL)


Arguments

Call | Description
-------------------- | ---------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
taxa_to_extract | A string, or vector of strings, naming the taxa of interest.
classification | Column name as a string in the tax_table for the level to match in. If NULL, all levels are searched.

Examples

GlobalPatterns
taxa_extract(GlobalPatterns, c("Cyano", "Proteo","Actinobacteria"))
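
Substring matching across taxonomic levels can be sketched with `grepl()` on a plain taxonomy table. The example below uses made-up taxa and assumes case-sensitive, literal matching; how taxa_extract handles case and regular expressions is not stated here, so treat those details as assumptions.

```r
# Toy taxonomy table: one row per taxon, one column per level.
tax <- data.frame(
  Phylum = c("Cyanobacteria", "Proteobacteria", "Firmicutes"),
  Class = c("Chloroplast", "Alphaproteobacteria", "Bacilli"),
  row.names = c("OTU_1", "OTU_2", "OTU_3"),
  stringsAsFactors = FALSE
)
patterns <- c("Cyano", "Proteo")

# TRUE for a taxon if any pattern occurs in any of its levels.
hits <- apply(tax, 1, function(taxon) {
  any(sapply(patterns, function(p) grepl(p, taxon, fixed = TRUE)))
})
keep <- rownames(tax)[hits]

# Restricting the search to one level only inspects that column.
keep_class <- rownames(tax)[grepl("Bacilli", tax$Class, fixed = TRUE)]

keep
```

Negating `hits` gives the complementary selection, which is the behavior described for taxa_prune.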




taxa_filter

This is a robust function that is used by nearly every other function in this package. It uses many of the subsetting processes distributed within phyloseq, but strives to make them more user-friendly and to combine them into a one-stop function. The function works in several steps.

If frequency is set to 0 (default), then the function removes any taxa with no abundance in any sample.

Usage

taxa_filter(phyloseq_obj, treatment = NULL, subset = NULL, frequency = 0, below = FALSE, drop_samples = FALSE)


Arguments

Call | Description
-------------------- | ------------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
treatment | Column name, as a string or a vector of strings, in the sample_data.
subset | A factor within the treatment. This will remove any samples that do not contain this factor. This can be a vector of multiple factors to subset on.
frequency | The proportion of samples the taxa is found in.
below | Whether frequency defines the minimum (FALSE) or maximum (TRUE) proportion of samples the taxa is found in.
drop_samples | Whether the function should remove samples that are empty after removing taxa filtered by frequency (TRUE).

Examples

The GlobalPatterns data has 19,216 OTUs listed in its tax_table.

GlobalPatterns

However, 228 of those taxa are not actually seen in any of the samples.

length(phyloseq::taxa_sums(GlobalPatterns)[phyloseq::taxa_sums(GlobalPatterns) == 0])

taxa_filter with frequency = 0 will remove those taxa.

taxa_filter(GlobalPatterns, frequency = 0)

Say that we want to look only at taxa that are seen in at least 80% of the samples.

taxa_filter(GlobalPatterns, frequency = 0.80)

But if we want taxa that are seen in 80% of the samples in any one treatment group:

taxa_filter(GlobalPatterns, frequency = 0.80, treatment = 'SampleType')

This returns a larger number of taxa, since each taxon needs to be seen in fewer samples overall.
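
The difference between filtering on overall frequency and filtering within treatment groups can be seen on a small made-up matrix. This base-R sketch assumes a taxon passes when its proportion of non-zero samples is greater than or equal to the threshold; whether taxa_filter uses a strict or inclusive comparison is an assumption, and all object names are hypothetical.

```r
# Toy OTU table: rows are taxa, columns are samples (counts are made up).
otu <- matrix(
  c(3, 1, 0, 0,
    2, 0, 5, 0,
    0, 0, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU_1", "OTU_2", "OTU_3"), c("S1", "S2", "S3", "S4"))
)
treatment <- c("soil", "soil", "water", "water")
frequency <- 0.8

# Proportion of all samples in which each taxon has a non-zero count.
overall <- rowMeans(otu > 0)

# The same proportion, computed separately within each treatment group.
per_group <- sapply(
  split(colnames(otu), treatment),
  function(samples) rowMeans(otu[, samples, drop = FALSE] > 0)
)

# Taxa passing the threshold overall, and in at least one treatment group.
keep_overall <- names(overall)[overall >= frequency]
keep_by_group <- rownames(per_group)[apply(per_group >= frequency, 1, any)]

keep_by_group
```

Here OTU_1 is present in only half of all samples, so it fails the overall filter, but it is present in every soil sample, so it passes the per-treatment filter.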



taxa_prune

Creates a new phyloseq object omitting the specified taxa. Each taxon name can be given as a substring or as the entire name. The string is matched across all taxonomic levels unless a specific classification level is declared.

Usage

taxa_prune(phyloseq_obj, taxa_to_remove, classification = NULL)


Arguments

Call | Description
-------------------- | ---------------------------------------------------------
phyloseq_obj | A phyloseq-class object.
taxa_to_remove | A string, or vector of strings, naming the taxa to remove.
classification | Column name as a string in the tax_table for the level to match in. If NULL, all levels are searched.

Examples

GlobalPatterns
taxa_prune(GlobalPatterns, c("Cyano", "Proteo","Actinobacteria"))





schuyler-smith/phyloschuyler documentation built on March 27, 2024, 4:29 p.m.