schtools::set_knitr_opts() knitr::opts_chunk$set( collapse = TRUE, echo = TRUE, comment = "#>" )
library(schtools) library(dplyr) library(ggplot2) library(readr) library(tidyr)
You can read a shared file and calculate relative abundances with calc_relabun()
:
shared_dat <- read_tsv(system.file("extdata", "test.shared", package = "schtools" )) relabun_dat <- shared_dat %>% calc_relabun() head(relabun_dat)
calc_relabun()
returns the data frame in long format.
You can use tidyr::pivot_wider()
to convert it to wide format:
wide_dat <- relabun_dat %>% pivot_wider(names_from = "otu", values_from = "rel_abun") head(wide_dat)
You can see that the relative abundances for each sample sum to 1:
wide_dat %>% select(starts_with("Otu")) %>% rowSums()
mothur formats taxonomy files as tab-separated values (tsv).
You can use read_tax()
to parse the taxonomy data and create separate columns
for each taxonomic level.
tax_dat <- read_tax(system.file("extdata", "test.taxonomy", package = "schtools" )) head(tax_dat)
The column label_html
provides html that correctly italicizes the genus name
without italicizing the OTU label. This can be used with
ggtext::element_markdown()
to make nice plots:
library(ggtext) set.seed(20220427) relabun_dat %>% mutate( sample_num = stringr::str_remove(sample, "p") %>% as.integer(), treatment = case_when( sample_num %% 2 == 1 ~ "A", TRUE ~ "B" ) ) %>% inner_join(tax_dat, by = "otu") %>% ggplot(aes(x = rel_abun, y = label_html, color = treatment)) + geom_jitter(alpha = 0.7, height = 0.2) + labs(x = "Relative abundance", y = "") + theme_minimal() + theme(axis.text.y = element_markdown())
A common task is to repeat OTU-level analyses at different taxonomic levels to determine which resolution is optimal for answering your questions. You'll need a shared file, generated from clustering sequences into OTUs with mothur, and a corresponding taxonomy file. Take a look at the mothur documentation for info on generating these files and performing microbiome analyses.
In this example, pool_taxon_counts()
pools the OTU counts in the shared file
at the genus level and returns new shared and taxonomy data frames.
tax_dat <- read_tax(system.file("extdata", "test.taxonomy", package = "schtools" )) shared_dat <- readr::read_tsv(system.file("extdata", "test.shared", package = "schtools" )) pool_taxon_counts(shared_dat, tax_dat, "genus")
You can do this for any taxonomic level in your taxonomy data frame.
pool_taxon_counts(shared_dat, tax_dat, "phylum")
If you have a distance file saved as a phylip-formatted lower triangle matrix
from mothur's dist.seqs
command,
you can read it into R with read_dist()
:
dist_filepath <- system.file("extdata", "sample.final.thetayc.0.03.lt.ave.dist", package = "schtools" ) dist_tbl <- read_dist(dist_filepath) head(dist_tbl)
When writing scientific papers with R Markdown, we often find ourselves
using the same knitr chunk options and miscellaneous helper functions.
To use our favorite options like eval=TRUE
, echo=FALSE
, and others,
run set_knitr_opts()
in the first chunk of your R Markdown document:
`r ''````r set_knitr_opts() ```
This also sets the inline hook to our custom inline_hook()
function, which
automatically formats numbers in a human-readable way and inserts an Oxford
comma into lists when needed.
When writing with R Markdown, you may wish to insert a list or vector
inline and correctly format it with an Oxford comma.
inline_hook()
uses paste_oxford_list()
to help you do just that!
animals <- c("cats", "dogs", "fish")
Insert the string as inline code with `r `
:
`r "\u0060r animals\u0060"`
are the most common pets.
Rendered output:
r paste_oxford_list(animals)
are the most common pets.
inline_hook()
uses format_numbers()
under the hood to automatically format
numbers to a human-readable format, rather than display in scientific notation.
The numbers
`r "\u0060r c(1e-04, 1e-05, 1e-06)\u0060"`
are very precise, while`r "\u0060r c(1e04, 1e05, 1e06)\u0060"`
are very large.
Rendered output:
The numbers
r inline_hook(c(1e-04, 1e-05, 1e-06))
are very precise. whiler inline_hook(c(1e04, 1e05, 1e06))
are very large.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.