heavy_SIP: Heavy-SIP analysis
In HTSSIP: High Throughput Sequencing of Stable Isotope Probing Data Analysis


Compare taxon abundances in 'heavy' fractions versus specific controls.

heavy_SIP(physeq, ex = "Substrate=='12C-Con'", rep = "Replicate",
  light_window = c(1.68, 1.7), heavy_window = c(1.73, 1.75),
  comparison = c("H", "H-v-L", "H-v-H"), hypo_test = c("binary",
  "t-test", "wilcox"), alternative = c("greater", "two.sided", "less"),
  sparsity_threshold = 0.1, sparsity_apply = c("all", "heavy"),
  padj_method = "BH")

`physeq`	A phyloseq object containing just one treatment and its control. If you have a more complicated experimental design, subset the phyloseq object into a list of treatment-versus-control comparisons.
`ex`	Expression for selecting controls based on metadata
`rep`	Column specifying gradient replicates. If the column does not exist, it is created and all samples are considered "replicate=1".
`light_window`	A vector designating the "light" BD window start and stop
`heavy_window`	A vector designating the "heavy" BD window start and stop
`comparison`	Which light/heavy BD windows to compare (see the description below).
`hypo_test`	Which hypothesis test to run on each OTU? Note that "binary" isn't really a hypothesis test, but just qualitative.
`alternative`	The "alternative" option for the hypothesis test functions. Note that "two.sided" doesn't work for the "binary" test.
`sparsity_threshold`	All OTUs observed in less than this portion (fraction: 0-1) of gradient fraction samples are pruned. As a form of independent filtering, the sparsity cutoff with the most rejected hypotheses is used.
`sparsity_apply`	Apply sparsity threshold to all gradient fraction samples ('all') or just heavy fraction samples ('heavy')
`padj_method`	Multiple hypothesis correction method. See 'p.adjust()' for more details.
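
A minimal base-R sketch of what `sparsity_threshold` and `padj_method` describe, using a made-up count matrix and made-up p-values. This is not the package's internal code: the 0.15 cutoff, the object names, and the data are assumptions for illustration, and a single cutoff is applied rather than a search over several.

```r
# Toy OTU table (hypothetical): rows are OTUs, columns are gradient fraction samples
counts <- matrix(
  c(5, 0, 3, 2, 0, 1, 4, 6, 2, 9,   # OTU_a: detected in 8 of 10 samples
    0, 0, 0, 0, 0, 0, 0, 0, 0, 7,   # OTU_b: detected in 1 of 10 samples
    1, 1, 0, 0, 0, 0, 0, 0, 0, 0),  # OTU_c: detected in 2 of 10 samples
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU_a", "OTU_b", "OTU_c"), paste0("frac", 1:10))
)

# Fraction of samples in which each OTU is observed
prevalence <- rowMeans(counts > 0)

# Prune OTUs observed in less than the threshold fraction of samples
sparsity_threshold <- 0.15  # assumed value for this toy example
kept <- counts[prevalence >= sparsity_threshold, , drop = FALSE]

# Benjamini-Hochberg correction of per-OTU p-values (hypothetical values)
pvals <- c(0.001, 0.02, 0.04, 0.30)
padj <- p.adjust(pvals, method = "BH")
```

Filtering before testing reduces the number of hypotheses, which in turn changes the adjusted p-values returned by `p.adjust()`.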

'Heavy-SIP' encompasses the analyses often used in SIP studies prior to new HTS-SIP methodologies. These methods all consisted of identifying 'heavy' gradient fractions. This was often done by comparing the distribution of DNA concentration or gene copies across gradient fractions in labeled treatments versus unlabeled controls. Sometimes, the unlabeled control was left out, and "heavy" gradients were identified based on comparisons with theoretical distributions of unlabeled DNA.

Although hypothesis testing was often used to assess increased taxon abundances in "heavy" gradients of labeled treatments (e.g., one-tailed t-tests), the hypothesis testing usually did not account for the compositional nature of sequence data (relative abundances).

Here, "heavy-SIP" can define incorporators as either:

"H" = Any taxa IN the "heavy" fractions of the labeled treatment gradients
"H-v-L" = Any taxa IN the "heavy" fractions of the labeled treatment and NOT present in the "light" fractions of the labeled treatment
"H-v-H" = Any taxa IN the "heavy" fractions of the labeled treatment and NOT present in the "heavy" fractions of the control

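The presence/absence definitions above reduce to set operations on the taxa detected inside each buoyant density (BD) window. The following base-R sketch illustrates this with made-up counts. It is not the package's implementation: the data, the helper `present_in()`, and the assumption that window boundaries are inclusive are all hypothetical.

```r
heavy_window <- c(1.73, 1.75)
light_window <- c(1.68, 1.70)

# Buoyant density of each fraction (assumed identical in both toy gradients)
bd <- c(1.69, 1.70, 1.72, 1.74, 1.75)

otus <- paste0("OTU_", 1:3)
# Labeled treatment gradient: rows are OTUs, columns are fractions
treat <- matrix(c(9, 8, 2, 3, 1,
                  0, 0, 1, 5, 4,
                  0, 0, 0, 2, 6),
                nrow = 3, byrow = TRUE, dimnames = list(otus, NULL))
# Unlabeled control gradient
ctrl <- matrix(c(9, 7, 1, 0, 0,
                 0, 0, 0, 0, 0,
                 0, 0, 0, 1, 2),
               nrow = 3, byrow = TRUE, dimnames = list(otus, NULL))

# Hypothetical helper: names of OTUs with any counts inside a BD window
present_in <- function(counts, bd, window) {
  cols <- bd >= window[1] & bd <= window[2]
  rownames(counts)[rowSums(counts[, cols, drop = FALSE]) > 0]
}

heavy_treat <- present_in(treat, bd, heavy_window)
light_treat <- present_in(treat, bd, light_window)
heavy_ctrl  <- present_in(ctrl, bd, heavy_window)

# In the treatment's heavy window but absent from its light window
heavy_not_light <- setdiff(heavy_treat, light_treat)
# In the treatment's heavy window but absent from the control's heavy window
heavy_not_ctrl <- setdiff(heavy_treat, heavy_ctrl)
```

In this toy data the two set differences disagree on OTU_1 and OTU_3, which shows why the choice of comparison matters.
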
Instead of binary comparisons (presence/absence), one-tailed t-tests or Wilcoxon Rank Sum tests can be used to assess differential abundance between "heavy" and controls. The hypothesis testing methods require multiple replicate controls and use the mean taxon abundance in the "heavy" (and "light") window.
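
As a rough illustration of the hypothesis-testing route, the sketch below runs one-tailed tests for a single OTU with base R's `t.test()` and `wilcox.test()`. The per-replicate window means are invented values, and the object names are assumptions; the package's own preprocessing is not reproduced here.

```r
# Hypothetical mean relative abundance of one OTU in the "heavy" window,
# one value per gradient replicate
heavy_treat <- c(0.040, 0.055, 0.048)  # labeled treatment replicates
heavy_ctrl  <- c(0.010, 0.012, 0.008)  # unlabeled control replicates

# One-tailed tests: is abundance greater in the treatment than in the control?
tt <- t.test(heavy_treat, heavy_ctrl, alternative = "greater")
wt <- wilcox.test(heavy_treat, heavy_ctrl, alternative = "greater")

pvals <- c(t_test = tt$p.value, wilcox = wt$p.value)
```

With three replicates per group, the smallest possible one-sided Wilcoxon Rank Sum p-value is 1/20 = 0.05, so the test has little power at low replication.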

A data.frame object of hypothesis test results.

data(physeq_S2D2)
data(physeq_rep3)
## Not run: 
# Calculating 'binary' for unreplicated experiment
## Subsetting phyloseq by Substrate and Day
params = get_treatment_params(physeq_S2D2, c('Substrate', 'Day'))
params = dplyr::filter(params, Substrate!='12C-Con')
ex = "(Substrate=='12C-Con' & Day=='${Day}') | (Substrate=='${Substrate}' & Day == '${Day}')"
physeq_S2D2_l = phyloseq_subset(physeq_S2D2, params, ex)

## Calculating heavy-SIP on 1 subset (use lapply function to process full list)
incorps = heavy_SIP(physeq_S2D2_l[[1]])

# Calculating wilcox test on replicated design
## (comparing heavy-treatment versus heavy-control)
incorps = heavy_SIP(physeq_rep3, ex="Treatment=='12C-Con'", comparison='H-v-H', hypo_test='wilcox')

## End(Not run)