phylosmith

A supplementary package that builds on the phyloseq package. Phyloseq objects are a great data standard for microbiome and gene-expression data; this package aims to provide easy data wrangling and visualization.

Installation

Requirements

Linux Systems

For some Linux systems you may need to install the following system libraries through your terminal.

Ubuntu example:

sudo apt-get install libmysqlclient-dev libgdal-dev libudunits2-dev

These libraries are required by some dependencies and may not come with your default OS distribution.

Windows Systems

If you are working on Windows, you will likely need to install Rtools, which is distributed through CRAN. When prompted, select the option to add Rtools to the system PATH.

R

phylosmith depends on the phyloseq package released by Dr. Paul McMurdie. The package is maintained on Bioconductor and can be installed through R using the following commands:

if(!requireNamespace("BiocManager", quietly = TRUE)){
  install.packages("BiocManager")
} 
BiocManager::install("phyloseq")

Additionally, the package imports a number of other packages to use their advanced functions. These packages may be installed automatically along with phylosmith, but it is best to install them independently beforehand.

install.packages(c("devtools", "RcppEigen", "RcppParallel", "Rtsne", "ggforce", "units"))
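If some of these packages are already present, reinstalling all of them can be slow. The following is a minimal sketch, using only base R, of finding the ones that are missing before installing. The helper name `missing_packages` is hypothetical and is not part of phylosmith.

```r
# Hypothetical helper (not part of phylosmith): return the packages in `pkgs`
# whose namespaces cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

deps <- c("devtools", "RcppEigen", "RcppParallel", "Rtsne", "ggforce", "units")
to_install <- missing_packages(deps)

# Report what is missing; uncomment the last line to actually install.
print(to_install)
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so the check can be run without changing the library.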

phylosmith

The package is hosted on GitHub and can be installed through R with:

remotes::install_github('schuyler-smith/phylosmith')
library(phylosmith)

Functions

Wrangling

Call | Description
-------------------- | ------------------------------------------------------------
conglomerate_samples | combines samples based on common factor within sample_data
conglomerate_taxa | combines taxa that have same classification
melt_phyloseq | melts a phyloseq object into a data.table
merge_treatments | combines multiple columns in meta-data into a new column
relative_abundance | transform abundance data to relative abundance
set_sample_order | sets the order of the samples of a phyloseq object
set_treatment_levels | sets the order of the factors in a sample_data column
taxa_filter | filter taxa by proportion of samples seen in
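The wrangling functions can be combined into a short pipeline. The sketch below is illustrative only: it assumes that soil_column has a "Phylum" rank in its taxonomy table, that each function takes the phyloseq object as its first argument, and that `conglomerate_taxa` takes the classification level as its second. Check `?conglomerate_taxa`, `?relative_abundance`, and `?melt_phyloseq` for the exact signatures in your installed version.

```r
library(phylosmith)

# Load the bundled example dataset named in this README.
data(soil_column)

# Combine taxa that share the same Phylum classification (assumed rank name).
phyla <- conglomerate_taxa(soil_column, "Phylum")

# Convert counts to relative abundance.
phyla_rel <- relative_abundance(phyla)

# Melt to a long data.table for custom summaries or plotting.
long <- melt_phyloseq(phyla_rel)
head(long)
```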

Analytics

Call | Description
-------------------- | ------------------------------------------------------------
common_taxa | find taxa common to each treatment
taxa_core | filter taxa by proportion of samples and relative abundance
taxa_proportions | computes the proportion of a taxa classification
unique_taxa | find taxa unique to each treatment

Graphs

Abundance

Call | Description
-------------------- | ------------------------------------------------------------
abundance_heatmap | create a ggplot object of the heatmaps of the abundance table
abundance_lines | create a ggplot object of the abundance data as a line graph
phylogeny_profile | create a ggplot barplot object of the compositions of each sample at a taxonomic level
taxa_abundance_bars | create a ggplot object of the abundance of taxa in each sample
taxa_core_graph | create a ggplot object of the core taxa over a range of parameters
variable_correlation_heatmap | create a ggplot heatmap of the correlation of numerical variables with taxa

Diversity

Call | Description
-------------------- | ------------------------------------------------------------
alpha_diversity_graph | create a ggplot-object box-plot of the alpha-diversity from a phyloseq-object.
nmds_phyloseq | create a ggplot object of the NMDS from a phyloseq object
pcoa_phyloseq | create a ggplot object of the PCoA from a phyloseq object
tsne_phyloseq | create a ggplot object of the t-SNE from a phyloseq object

Networks

Call | Description
-------------------- | ------------------------------------------------------------
co_occurrence_network | creates a network of the co-occurrence of taxa
network_layout_ps | creates a layout object for a network
variable_correlation_network | creates a network of the correlation of taxa and sample variables

Calculations

Call | Description
-------------------- | ------------------------------------------------------------
co_occurrence | calculate co-occurrence between taxa
permute_rho | runs permutations of the otu_table to calculate a significant $\rho$ value
histogram_permuted_rhos | Create a ggplot object of the distribution of rho values.
quantile_permuted_rhos | calculate quantiles for the permuted rho values from the Spearman-rank co-occurrence
variable_correlation | calculate the correlation of numerical variables with taxa abundances

Datasets

Originally I created two mock phyloseq objects (mock_phyloseq and mock_phyloseq2) that contain no real-world data but serve as simple examples of how the functions work.

Then I decided that I should include a real example of microbiome data (soil_column) because it's always nice to see real examples. soil_column is a published dataset from my lab group. The data come from an experiment that used 16S sequencing to track the microbial composition of farmland soil before and after manure application, over time.
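For a quick look at the bundled data, the sketch below loads soil_column and summarizes it with accessor functions from phyloseq (`nsamples`, `ntaxa`, `sample_variables`, `rank_names`). It assumes phylosmith and phyloseq are installed and that the dataset can be loaded with `data()`. No output is shown because it depends on the installed version.

```r
library(phyloseq)
library(phylosmith)

# Load the bundled example dataset named in this README.
data(soil_column)

# Basic dimensions of the phyloseq object.
nsamples(soil_column)
ntaxa(soil_column)

# Metadata columns and taxonomic ranks available for grouping and plotting.
sample_variables(soil_column)
rank_names(soil_column)
```

The same calls should work on mock_phyloseq and mock_phyloseq2, assuming they are loaded the same way.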





schuyler-smith/phyloschuyler documentation built on March 27, 2024, 4:29 p.m.