calc_excess | R Documentation |
Calculates fractional isotope incorporation in excess of natural abundances
calc_excess(
  data,
  tax_id = c(),
  sample_id = c(),
  wads = "wad",
  iso_trt = c(),
  isotope = c(),
  bootstrap = FALSE,
  iters = 999L,
  grouping_cols = c(),
  min_freq = 3,
  correction = TRUE,
  rm_outliers = TRUE,
  non_grower_prop = 0.1,
  total_enrich = 1,
  wad_to_gc_slope = 0.083506,
  wad_to_gc_intercept = 1.646057,
  nat_abund_13C = 0.01111233,
  nat_abund_15N = 0.003663004,
  nat_abund_18O = 0.002011429
)
data |
Data as a long-format data.table where each row represents a taxonomic feature within a single fraction.
Typically, this is the output from the |
tax_id |
Column name specifying unique identifier for each taxonomic feature. |
sample_id |
Column name specifying unique identifier for each replicate. |
wads |
Column name specifying weighted average density values. This can be changed, but doing so is not recommended. |
iso_trt |
Column name specifying a two-level categorical column indicating whether a sample has been amended with a stable isotope (i.e., is "heavy") or whether its
isotopic composition is at natural abundance (i.e., is "light").
Any terms may be used, but take care with how they are ordered, because the first factor level is treated as the unlabeled ("light") treatment.
If supplied as a factor, |
isotope |
Column name specifying the isotope applied to each replicate. For "heavy" samples, these values should be one of "13C", "15N", or "18O". |
bootstrap |
Whether to generate bootstrapped enrichment values for each taxonomic feature across groups of samples.
If |
iters |
Integer specifying the number of bootstrap iterations to perform. |
grouping_cols |
Column(s) used to indicate bootstrap resampling groups. Within each group, replicates will be resampled with replacement. Resampling will not occur across groups. |
min_freq |
Minimum number of replicates a taxonomic feature must occur in to be kept. If treatment grouping columns are specified, frequencies will be assessed at this level. For unlabeled "light" samples, treatment groupings will be ignored when assessing adequate frequency of occurrence. |
correction |
Whether to apply a correction to fractional enrichment values to ensure a certain proportion are positive. |
non_grower_prop |
Fractional value applied if |
total_enrich |
Argument of length one describing the total enrichment value of the target isotope(s) achieved. The final fractional isotope enrichment calculations will be divided by this amount. A numeric value should be a positive number no greater than one and will be applied to every sample. A character value indicates a column with one or more total enrichment values, so that sample- or group-specific total enrichment values may be applied. |
wad_to_gc_slope |
Single numeric value indicating the slope of the relationship between WAD values and estimated genomic GC content. The default value represents that calculated by Hungate et al. 2015. |
wad_to_gc_intercept |
Single numeric value indicating the intercept of the relationship between WAD values and estimated genomic GC content. The default value represents that calculated by Hungate et al. 2015. |
nat_abund_13C |
Single numeric value indicating the default assumption for background 13C. See |
nat_abund_15N |
Single numeric value indicating the default assumption for background 15N. See |
nat_abund_18O |
Single numeric value indicating the default assumption for background 18O. See |
rm_outliers |
Whether to remove low fractional enrichment values that fall more than 1.5 times the interquartile range below the 25th percentile.
If |
calc_excess automatically averages the isotopically unamended WAD values for each taxonomic feature, on the assumption that density values
will be identical (or nearly identical) across those samples.
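The averaging step described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame, its column names, and its values are invented, and the package's internal implementation may differ.

```r
# Invented replicate-level WAD values; column names are assumptions
wads <- data.frame(
  asv_id  = c("asv1", "asv1", "asv2", "asv2", "asv1"),
  iso_trt = c("light", "light", "light", "light", "label"),
  wad     = c(1.700, 1.702, 1.690, 1.694, 1.712)
)

# Keep only the isotopically unamended ("light") replicates
light <- wads[wads$iso_trt == "light", ]

# Average the unlabeled WAD values for each taxonomic feature
light_mean <- tapply(light$wad, light$asv_id, mean)
light_mean
```

Each taxon ends up with a single unlabeled reference density, against which the density of every labeled replicate can then be compared.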
The equations for calculating the molecular weights of taxon i, designated M_{Lab,i} for labeled and M_{Light,i} for unlabeled, are:
M_{Light,i} = 0.496 \cdot G_{i} + 307.691
M_{Lab,i} = \left( \frac{\Delta W}{W_{Light,i}} + 1 \right) \cdot M_{Light,i}
Where
G_{i} = \frac{1}{0.083506} \cdot (W_{Light,i} - 1.646057)
which indicates the GC content of taxon i based on the density of its DNA when unlabeled, and \Delta W is the difference in weighted average density between the labeled and unlabeled treatments.
Note that the slope and intercept in this equation may be changed using the wad_to_gc_slope and wad_to_gc_intercept parameters.
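As a rough worked example of the molecular weight equations above, the base R sketch below computes G_{i}, M_{Light,i}, and M_{Lab,i} for a single taxon. The WAD values and variable names are invented, and \Delta W is assumed to be the labeled WAD minus the unlabeled WAD; this is not the package's internal code.

```r
# Invented WAD values (g/ml) for one taxon
w_light <- 1.700  # mean WAD across unlabeled replicates
w_lab   <- 1.710  # WAD in one labeled replicate

# Defaults of wad_to_gc_slope and wad_to_gc_intercept
slope     <- 0.083506
intercept <- 1.646057

# GC content estimated from the unlabeled density
gc <- (1 / slope) * (w_light - intercept)

# Molecular weight of the unlabeled DNA
m_light <- 0.496 * gc + 307.691

# Density shift, assumed here to be labeled minus unlabeled WAD
delta_w <- w_lab - w_light

# Molecular weight of the labeled DNA
m_lab <- (delta_w / w_light + 1) * m_light
```

Under these assumptions the ratio m_lab / m_light equals w_lab / w_light, so the molecular weight scales in direct proportion to the observed density shift.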
The calculation for the fractional enrichment of taxon i, A_{i}, is:
A_{i} = \frac{M_{Lab,i} - M_{Light,i}}{M_{Heavymax,i} - M_{Light,i}} \cdot (1 - N_{x})
Where
N_{x}: The natural abundance of the heavy isotope. Default estimates are: N_{18O} = 0.002011429, N_{13C} = 0.01111233, and N_{15N} = 0.003663004
M_{Heavymax,i}: The highest theoretical molecular weight of taxon i assuming maximum labeling by the heavy isotope
M_{O,Heavymax,i} = (12.07747 + M_{Light,i}) \cdot L
M_{C,Heavymax,i} = (-0.4987282 \cdot G_{i} + 9.974564 + M_{Light,i}) \cdot L
M_{N,Heavymax,i} = (0.5024851 \cdot G_{i} + 3.517396 + M_{Light,i}) \cdot L
L: The maximum label possible, based on the percent of heavy isotope making up the atoms of that element in the labeled treatment
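The enrichment calculation above can be traced with a small base R sketch. The helper names heavy_max and calc_eaf are hypothetical and are not functions from the package; the input values are invented, and L is assumed to be 1.

```r
# Natural abundance defaults, as listed in the function signature
nat_abund <- c("13C" = 0.01111233, "15N" = 0.003663004, "18O" = 0.002011429)

# Hypothetical helper: maximum molecular weight under full labeling
heavy_max <- function(m_light, gc, isotope, L = 1) {
  shift <- switch(isotope,
                  "18O" = 12.07747,
                  "13C" = -0.4987282 * gc + 9.974564,
                  "15N" = 0.5024851 * gc + 3.517396,
                  stop("isotope must be one of '13C', '15N', or '18O'"))
  (shift + m_light) * L
}

# Hypothetical helper: fractional enrichment A_i
calc_eaf <- function(m_lab, m_light, gc, isotope, L = 1) {
  m_max <- heavy_max(m_light, gc, isotope, L)
  (m_lab - m_light) / (m_max - m_light) * (1 - nat_abund[[isotope]])
}

# Invented values for one taxon in an 18O experiment
gc      <- 0.646
m_light <- 0.496 * gc + 307.691
m_lab   <- m_light + 1.8
eaf_18O <- calc_eaf(m_lab, m_light, gc, "18O")
```

With L equal to 1, the denominator for 18O reduces to the constant 12.07747, so the enrichment in this sketch depends only on the molecular weight gain and the natural abundance term.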
calc_excess returns a data.table where each row represents a taxonomic feature within a single replicate.
The following columns are produced: excess atom fraction (eaf). NA values (usually arising when an organism is present in
only the unlabeled or only the labeled samples) are removed.
Because fraction-level data are being condensed to replicate level, a list of grouping columns to keep is not necessary unless bootstrapping is specified. In that case, the groupings are used to organize the resampling effort.
Bootstrap functionality produces the columns 2.5%, 50%, and 97.5%, reflecting the median bootstrapped enrichment value and the 95% confidence interval
of enrichment values.
In addition, the probability that a taxon's enrichment was less than zero is reported
in the p_val column, which is based on the proportion of sub-zero bootstrapped observations.
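The bootstrap summary columns can be illustrated with a simplified base R sketch. It resamples invented replicate-level enrichment values for one taxon within one group and takes the mean of each resample; the package may resample at a different stage of the calculation, so treat this only as an illustration of how the quantile and p_val columns relate to the bootstrapped values.

```r
set.seed(1)

# Invented enrichment values for one taxon across replicates in one group
eaf_reps <- c(0.12, 0.18, 0.09, 0.15)
iters    <- 999L

# Resample replicates with replacement and summarize each resample by its mean
boot_means <- replicate(iters, mean(sample(eaf_reps, replace = TRUE)))

# Median and 95% confidence interval of the bootstrapped values
ci <- quantile(boot_means, probs = c(0.025, 0.5, 0.975))

# Proportion of bootstrapped values below zero
p_val <- mean(boot_means < 0)
```

Because every invented replicate value here is positive, no bootstrapped mean can fall below zero and p_val is 0.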
If iso_trt is not a factor, it will be coerced to one, and the function will
return a message telling the user which value was assigned to which isotope labeling level.
Most users adopt a scheme of using the terms "light" and "label" to define their
treatments, but please note that the word "label" comes first in the alphabet and
will therefore be assigned by the function to be the unlabeled or light treatment.
If you wish to avoid this (you almost certainly do), you must specify your factor levels beforehand.
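The factor-level pitfall described above can be reproduced in base R. The vector below is invented for illustration; the same approach would apply to the iso_trt column of your own data.

```r
# Invented treatment labels
trt <- c("light", "label", "label", "light")

# Default coercion sorts levels alphabetically, so "label" comes first
trt_default <- factor(trt)
levels(trt_default)

# Setting the levels explicitly puts "light" first
trt_fixed <- factor(trt, levels = c("light", "label"))
levels(trt_fixed)
```

Setting the levels before calling calc_excess means the function does not need to coerce the column itself.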
Hungate, Bruce, et al. 2015. Quantitative microbial ecology through stable isotope probing. Applied and Environmental Microbiology 81: 7570-7581.
Morrissey, Ember, et al. 2018. Taxonomic patterns in nitrogen assimilation of soil prokaryotes. Environmental Microbiology 20: 1112-1119.
calc_wad, wad_wide
# Load in example data
data(example_qsip)
# relativize sequence abundances (should be done after taxonomic filtering)
example_qsip[, rel_abund := seq_abund / sum(seq_abund), by = sampleID]
# check that the "light" treatment is the first factor level in the isotope treatment column
levels(example_qsip$iso_trt)
# calculate weighted average densities
wads <- calc_wad(example_qsip,
                 tax_id = 'asv_id', sample_id = 'sampleID', frac_id = 'fraction',
                 frac_dens = 'Density.g.ml', frac_abund = 'avg_16S_g_soil',
                 rel_abund = 'rel_abund',
                 grouping_cols = c('treatment', 'isotope', 'iso_trt', 'Phylum'))
# calculate fractional enrichment in excess of background
eaf <- calc_excess(wads, tax_id = 'asv_id', sample_id = 'sampleID',
                   iso_trt = 'iso_trt', isotope = 'isotope')