In healyke/DEsiR: Stable Isotope Discrimination Estimation using R

Load the required packages:

library(SIDER)
library(MixSIAR)
library(dplyr)
library(ggplot2)

# knitr::opts_chunk$set(eval = FALSE)

N.B. to run this example you will need to download and install the jags software which is separate to R and separate to the R package rjags. This is required for MixSIAR to run. You can find it at http://mcmc-jags.sourceforge.net

In this example we illustrate how to include a SIDER analysis into a workflow for a mixing model anlaysis using MixSIAR. The section Use SIDER to estimate TDFs (discrimination factors) illustrates how to run SIDER and then use the estimated TDFs in a subsequent mixing model analysis using MixSIAR. The rest of the document is not required for such an analysis, and instead it serves as a comparison of alternative methods, that we would like to see consigned ultimately to history!

Throughout this example we run relatively short runs of SIDER which results in estimates that do not pass the convergence tests. This is done to keep the build time for the package vignettes to something reasonable. As per the introduction vignette, we recommend longer runs, which may take up to an hour or even more, to produce the definitive values you would used subsequently in any analysis.

We will take a single pack from the Wolves example in MixSIAR Semmens et al 2009 which is based on a study by Darimont et al 2008. The SIDER dataset does not contain a TDF for wolf hair (the tissue type sampled in that study) nor indeed any wolf tissue sample. For the original analysis by Darimont et al 2008 and subsequently used by Semmens et al 2009 to present a new heirarchical SIMM, the authors used TDFs from fox hair a nearest species neighbour approximation using data from Roth & Hobson 2000. We compare the effects on the SIMM estimates of diet using this fox TDF with a more recently derived TDF for wolf hair that is not included in SIDER owing to it not being a directly observed TDF from a feeding experiment and the TDFs estimated using SIDER. Using a nearest species neighbour is a commonly adopted approach by researchers when their species does not have a published TDF. We will argue later that researchers should use SIDER to esimate their TDF in nearly all cases even if species-specific data are available.

The instructions to run MixSIAR are taken from the manual for that package, and further information and explanation is included in the MixSIAR package documentation.

First we load the mixture data from a *csv file. Here we are using only a single pack from the mainland population of wolves who were found to consumre more deer than marine mammals or salmon.

mix.filename <- "../inst/extdata/wolves_consumer_mainland.csv"

# Load the mixture/consumer data
mix <- load_mix_data(filename = mix.filename, 
                     iso_names = c("d13C","d15N"), 
                     factors = NULL, 
                     fac_random = NULL, 
                     fac_nested = NULL, 
                     cont_effects = NULL)

Now read in the source data.

# Replace the system.file call with the path to your file
# source.filename <- system.file("extdata", "wolves_sources.csv", package = "SIDER")
source_filename <- "../inst/extdata/wolves_sources.csv"

# Load the source data
source <- load_source_data(filename = source_filename,
                           source_factors = NULL, 
                           conc_dep = FALSE, 
                           data_type = "means", 
                           mix)

Construct the jags model for this analysis using write_JAGS_model().

# Write the JAGS model file
model_filename <- "MixSIAR_model.txt"   # Name of the JAGS model file
resid_err <- FALSE
process_err <- TRUE

# this writes a jags model as a *txt file to your working directory
# which is used later to implement the model.
write_JAGS_model(model_filename, resid_err, process_err, mix, source)

Choose TDFs from the nearest species neighbour

The nearest species to the wolf in the dataset with known TDFs is the fox (Roth & Hobson 2000). Following the procedure adopted by both Darimont et al 2008 and Semmens et al 2009 we use only the means and set their variances to 0.

# extract the fox data using dplyr::filter() 
fox_data <- SIDER::isotope_data %>% filter(species == "Vulpes_vulpes", 
                                           tissue == "hair")


# create the fox_tdfs list and populate it with matrices for the means and sd
fox_tdfs <- list()

# matrix of means
fox_tdfs$mu <- matrix(0, ncol = 2, nrow = 3,
                       dimnames = list(source$source_names, mix$MU_names))

# matrix of standard deviations
fox_tdfs$sig2 <- matrix(0, ncol = 2, nrow = 3,
                       dimnames = list(source$source_names, mix$SIG_names))

# add in the means for the fox tdfs that are in the SIDER dataset
fox_tdfs$mu[,"Meand13C"] <- fox_data$delta13C
fox_tdfs$mu[,"Meand15N"] <- fox_data$delta15N


jags_fox <- run_model(run="short", mix, source, fox_tdfs, model_filename, 
                    alpha.prior = 1, resid_err, process_err)

Use TDFs from an experimental study

Although not from a controlled feeding experiment, Derbridge et al (2015) inferred TDFs for wolves of $\Delta^{13}\text{C} = 1.97$ ‰ and $\Delta^{15}\text{N} = 3.04$ ‰ using a modified Stable Isotope Mixing Model in which the TDFs themselves were estimated.

derbridge_tdfs <- fox_tdfs
derbridge_tdfs$mu[,"Meand13C"] <- 1.97
derbridge_tdfs$mu[,"Meand15N"] <- 3.04

derbridge_tdfs$sig2[,"SDd13C"] <- 0.70
derbridge_tdfs$sig2[,"SDd15N"] <- 0.31


jags_derbridge <- run_model(run="short", mix, source, derbridge_tdfs, model_filename, 
                    alpha.prior = 1, resid_err, process_err)

Use SIDER to estimate TDFs (discrimination factors)

The wolf isotope samples are described fully in Darimont, C.T., Papquet, P.C. and Reimchen, T.E. (2009), Landscape heterogeneity and marine subsidy generate exensive intrapopulation niche diversity in a large terrestrial vertebrate. Journal of Animal Ecology, 78: 126-133. doi.

# Read in the data
SIDER_data <- scrumpSider(iso.data = "all")

##all trees
combined_trees <- scrumpSider(tree = "all")



# Checking the data for the species we want to estimate TEF for (Canis lupus)
new_data_test <- recipeSider(species = "Canis_lupus", 
                             habitat = "terrestrial", 
                             taxonomic.class = "mammalia", 
                             tissue = "hair", 
                             diet.type = "carnivore", 
                             tree = combined_trees)

# prepare the carbon model
tdf_data_c <- prepareSider(new_data_test, 
                          isotope_data, 
                          combined_trees, 
                          "carbon")

# prepare the nitrogen model
tdf_data_n <- prepareSider(new_data_test, 
                          isotope_data, 
                          combined_trees, 
                          "nitrogen")

# formulae for both
formula_c <- delta13C ~ diet.type + habitat
formula_n <- delta15N ~ diet.type + habitat

# common random structure for both
random_terms <- ( ~ animal + species + tissue)

prior <- list(R = list(V = 1, nu=0.002), 
              G = list(G1=list(V = 1, nu=0.002),
                       G2=list(V = 1, nu=0.002), 
                       G3=list(V = 1, nu=0.002)))

# model run settings
# *Might* want these to be longer runs.
nitt   <- c(120000)
burnin <- c(20000)
thin   <- c(50)
parameters <- c(nitt, thin, burnin)
n_chains <- c(2)

# convergence settings
convergence =  c(1.1)
ESS = c(1000)


TDF_est_c <- imputeSider(mulTree.data = tdf_data_c, 
                         formula = formula_c, 
                         random.terms = random_terms,
                         prior = prior, 
                         output = "wolves_sider_c_run",
                         parameters = parameters,
                         chains = n_chains, 
                         convergence =  convergence, 
                         ESS = ESS)

TDF_est_n <- imputeSider(mulTree.data = tdf_data_n, 
                         formula = formula_n, 
                         random.terms = random_terms,
                         prior = prior, 
                         output = "wolves_sider_n_run",
                         parameters = parameters,
                         chains = n_chains, 
                         convergence =  convergence, 
                         ESS = ESS)


###Now lets have a look at the files imputeSider has saved to the current working directory
# list.files(pattern = "test_c_run")

Take a look at the summaries of the SIDER runs. First $\Delta^{13}$C.

# Calculate summary statistics of the posterior. 
# Specifically, the mean and standard deviation would be
# taken from here and used in a mixing model analysis using 
# MixSIAR, MixSIR or SIAR for example.
summary(TDF_est_c$tdf_global)

# Credible intervals and the mode of the posterior are obtained 
# using the hdrcde package
hdrcde::hdr(TDF_est_c$tdf_global, prob = c(50, 95, 99))

# You can also create density plots of the posterior
coda::densplot(TDF_est_c$tdf_global)

Second, $\Delta^{15}$N.

# Calculate summary statistics of the posterior. 
# Specifically, the mean and standard deviation would be
# taken from here and used in a mixing model analysis using 
# MixSIAR, MixSIR or SIAR for example.
summary(TDF_est_n$tdf_global)


# Credible intervals and the mode of the posterior are obtained 
# using the hdrcde package
hdrcde::hdr(TDF_est_n$tdf_global, prob = c(50, 95, 99))

# You can also create density plots of the posterior
coda::densplot(TDF_est_n$tdf_global)

Now incorporate the new TDF data in a mixsiar model.

# means and variances of the SIDER estimates
mu_c <- mean(TDF_est_c$tdf_global)
mu_n <- mean(TDF_est_n$tdf_global)

sig2_c <- var(TDF_est_c$tdf_global)
sig2_n <- var(TDF_est_n$tdf_global)

# construct the list object containing the mean and sd of the TDF data 
# to pass to MixSIAR. This is a list of length 2, each containing a matrix 
# with row.names of the sources, and specific column names that match the 
# source file. The code below extracts these from the loaded source and mixture 
# objects created early in the first section of this document via 
# load_mix_data() and load_source_data()

# construct the list object
sider_tdf <- list()

# create the matrix of mean TDFs
sider_tdf$mu <- matrix(c(rep(mu_c, 3), rep(mu_n, 3)), ncol = 2, nrow = 3,
                       dimnames = list(source$source_names, mix$MU_names))

# create the matrix of variances for the TDFs
sider_tdf$sig2 <- matrix(c(rep(sig2_c, 3), rep(sig2_n, 3)), ncol = 2, nrow = 3,
                       dimnames = list(source$source_names, mix$SIG_names))

And finally run mixsiar

jags_sider <- run_model(run="short", mix, source, sider_tdf, model_filename, 
                    alpha.prior = 1, resid_err, process_err)

Compare model estimates

cat("Model 1: use the fox TDFs\n")
knitr::kable(jags_fox$BUGSoutput$summary)

cat("Model 2: Inferred wolf TDFs from Derbridge paper\n")
knitr::kable(jags_derbridge$BUGSoutput$summary)

cat("Model 3ehad(post: SIDER derived TDFs\n")
knitr::kable(jags_sider$BUGSoutput$summary)

Bundle all the posteriors together using dplyr for plotting with ggplot. Here we focus on the proportion of deer in the diet of this wolf pack, being the largest constituent of the diet. As we can see, adding ad hoc uncertainty to the TDFs used in the original model widens the estimated proportion of deer in the diet and slightly lowers the mean estiamte (c.f. "Original" with "Added Variation" models). Using the fox as the nearest species in the SIDER dataset along with some ad hoc variation has little effect in this instance since their TDFs are similar to those used in the original model. The TDFs inferred by Derbridge et al (2015) differ from the "original" model primarily in having a higher $\Delta^{15}\text{N}$ of 3.04 ‰ compared with 2.60 ‰ and consequently the estimated proportion of deer in the diet is higher. Although the SIDER estimated $\Delta^{15}\text{N}$ is higher still at 3.32 (the $\Delta^{13}\text{C}$ is more similar) the additional uncertainty on these estimates widens and flattens the estimated proportion of deer in the diet, and also brings the esimate away from the boundary at 1, and closer to the prior which in this case is $\approx 0.33$.

We did not include the Derbridge et al (2015) TDFs in the SIDER dataset as they did not conform to our criteria as they are not derived directly from a controlled feeding trial, and instead are inferred indirectly using a modified mixing model approach. In any case, we would argue that unless you have a TDF for your specific population of consumers, and you are satisfied that they best represent their physiology and diet, then SIDER should be used to estimate the TDFs for your study. An additional option is to add the extra data to the SIDER dataset yourself and re-run the analysis to impute a wolf TDF which will now draw additional information from the wolf data in the dataset already. We provided an associated vignette illustrating how to this in SIDER.

post_df <- dplyr::bind_rows(list(as.data.frame(jags_fox$BUGSoutput$sims.matrix), 
                                 as.data.frame(jags_derbridge$BUGSoutput$sims.matrix),
                                 as.data.frame(jags_sider$BUGSoutput$sims.matrix))
                            , .id = "Model")

# change the names to match the sources
names(post_df) <- c("Model", "Deviance", source$source_names)

# convert to numeric and then factor
post_df$Model <- as.numeric(post_df$Model)
post_df$Model <- factor(post_df$Model, 
                        labels = c("Fox", 
                                   "Derbridge", "SIDER"))

# plot the Deer.
p1 <- ggplot(post_df, mapping = aes(x = Model, y = Deer)) + 
  geom_violin(draw_quantiles = c(0.025, 0.25, 0.5, 0.75, 0.975)) + theme_classic()
print(p1)

Figure 3. Violin plot of the estimate proportions of Deer in the diet for the 3 alternative models. Horizontal lines show the median at the middle, and then the interquartile range and 95% credible intervals.

healyke/DEsiR documentation built on April 19, 2020, 7:14 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com