Nothing
#' Null models for ecological networks
#'
#' Uses the algorithm of Agusti \emph{et al}. (2003) to specify and run a null
#' model for an ecological network based upon interaction data and
#' independent estimates of resource abundance (see Vaughan \emph{et al}.,
#' 2018 for full details). Typically, network nodes represent species,
#' and the following documentation uses the term species in place of node,
#' but this need not be the case.
#'
#' @param consumers A matrix or data frame containing the interaction data. The
#' first column should contain the name of the consumer species, with the
#' remaining column names listing the resources (the names must be identical
#' and in the same order as in \code{resources}). Each row represents one
#' individual, with the elements in the matrix indicating whether a resource
#' was used or how much was used (see Details).
#' @param resources A matrix or data frame containing the relative abundances of
#' the different resources (e.g. density of different prey species or abundance
#' of different flower species). May either have one row, if all data came
#' from the same location and/or time, or have the same number of rows as
#' there are sampling stratum codes in \code{r.samples} and \code{c.samples}
#' (e.g. the set of time points or plot names), if the data are subdivided
#' across sampling times/sites. Resource names (column names) in
#' \code{resources} and \code{consumers} must be identical and in the same
#' order.
#' @param sims Number of iterations of the null model. The default value is 100,
#' but this should generally be increased to estimate meaningful confidence
#' limits.
#' @param data.type An optional string specifying the type of interaction data.
#' One of three options: \code{"names"} (the default), \code{"counts"} or
#' \code{"quantities"}. See Details for a full explanation and examples.
#' @param maintain.d When \code{data.type = "counts"} or \code{"quantities"},
#' a string indicating whether the degree of each individual consumer (i.e.
#' the number of different resource species it interacted with) should be
#' maintained when allocating individual resources/quantities. Default is FALSE.
#' @param summary.type An optional string indicating how the interactions should
#' be summarised at the species level: one of \code{"sum"} (the default),
#' indicating that the interactions between a consumer and resource species
#' will be summed across all individuals, or \code{"mean"}, indicating that
#' the mean value per individual of a consumer species will be calculated.
#' @param c.samples An optional vector that specifies names for subdivisions of the
#' interaction data when data on interactions and resource abundance were
#' collected across a series of subdivisions (e.g. at different sites or time
#' points). If suddivisions were present, they should be specified for both
#' \code{c.samples} and \code{r.samples}. The sample names must be identical
#' in \code{c.samples} and \code{r.samples} (although \code{c.samples} is likely to
#' be much longer than \code{r.samples} due to >1 individual consumer per
#' subdivision)
#' @param r.samples An optional vector specifying the sample names for the
#' resource abundance data (\code{resources}), and corresponding to the names
#' used in \code{c.samples}. Must have the name number of entries as there are
#' rows in \code{resources}. Not needed when \code{resources} contains a single
#' row/\code{c.samples} is absent.
#' @param r.weights An optional data frame or matrix specifying weights to be
#' applied to rows in \code{resources}: all entries must be in the range 0--1.
#' The first column should contain the consumer species names, one entry for
#' each species, within the remaining columns listing all species present in
#' \code{resources} (names must be identical and in the same order as in
#' \code{resources}). Mainly used to specify forbidden links by specifying
#' zero values for forbidden links and ones for the remaining entries. Only
#' one table is supplied, which then applies across all subdivisions of the
#' data (if present).
#' @param prog.count A logical value specifying whether the progress count
#' should be shown. Defaults to \code{TRUE}.
#'
#' @details A basic call to \code{generate_null_net} only needs two arguments:
#' \code{consumers} and \code{resources}, but it is recommended that
#' \code{sims} is also specified to run a larger number of iteration of the
#' null model (e.g. 1000). It is important to ensure that species names are
#' consistent throughout and that the resource species used as column names in
#' \code{resources}, \code{consumers} and (optionally) \code{r.weights} are in
#' the same order (both \code{consumers} and \code{r.weights} should be one
#' column wider than \code{resources}, because they include an extra column
#' at the start to list the consumer species).
#'
#' The same species can appear as both a consumer and a resource e.g. when a
#' species is both predator and prey in a food web. Interactions can be
#' excluded from the modelled networks by specifying forbidden links with
#' \code{r.weights}, based either on existing data/literature or hypothesed
#' choices, generating the network that would be created if those choices were
#' made. Placing limits on the feasible range of resources with which a
#' consumer interacts should lead to more realistic networks: great white
#' sharks not feeding on zooplankton in a marine food web, for example.
#'
#' The interactions between individual consumers and the resources may be
#' recorded in a range of different ways depending upon the empirical data
#' that are available, and these differences are handled by using the
#' optional \code{data.type} argument. Three types of data can be specified:
#' \enumerate{
#' \item \code{data.type = "names"} (the default value). This is the most
#' common type of data, recording one or more resource species with which
#' an interaction occurred, but without any attempt to quantify the
#' strength of the interaction. For data of this type, \code{resources}
#' should simply comprise 1s and 0s, indicating whether an interaction was
#' present or absent respectively: row sums may equal one or be >1 if
#' individual consumers can interact with multiple resources.
#' \item \code{data.type = "counts"}. These data record the number of times
#' an individual consumer interacted with different resource species e.g.
#' the number of times different flower species were visited by an
#' individual pollinator during a 5 minute observation period. When
#' modelling count data there is a choice of whether the total number of
#' interactions (an individual's row sum) can be distributed across all
#' potential resources, which may result in interactions with a different
#' number of resource species (i.e. a change in an individual's degree) or
#' whether the degree is held constant for each individual. This is
#' determined by the additional \code{maintain.d} argument:
#' \code{maintain.d = FALSE} (the default) does not constrain an
#' individual's degree, whereas \code{maintain = TRUE} does.
#' \item \code{data.type = "quantities"}. Quantitative data are obtained
#' from each individual, such as the proportion of a predator's gut
#' contents derived from different prey. The data in \code{consumers} can
#' be represented by either proportions (so that the row sum for an
#' individual = 1) or in the native units (so the total <> 1). As for count
#' data, the degree can be free or fixed at the individual consumer level
#' by using the \code{maintain.d} argument.
#' }
#' For data types 2 and 3, it does not matter what units are used (e.g. each
#' row does not need to add up to one). The row total will be maintained,
#' so the results are returned in the same units as the original data.
#'
#' One problem that may arise in specifying the null model is in situations
#' where an interaction is recorded with a particular resource, but that
#' resource was not actually detected in the abundance data (i.e. abundance =
#' 0). This may occur, for example, when a predator eats a rare species that
#' was missed during concomitant sampling of prey abundance: in effect, the
#' predator's 'sampling' was more comprehensive. The implication of this is
#' that the resource species will not be sampled in the null model -- a
#' potential source of bias. \code{generate_null_net} will issue a warning if
#' it detects this situation. Possible corrective actions include removing
#' that resource species altogether or replacing its zero abundance with a
#' small constant.
#'
#' @return Returns an object of class "nullnet", which is a list containing
#' the following components:
#' \describe{
#' \item{\code{rand.data}}{Data frame containing the results from all of the
#' iterations of the null model}
#' \item{\code{obs.interactions}}{Interaction matrix summarising the observed
#' interactions (from \code{consumers})}
#' \item{\code{n.iterations}}{The value of \code{sims} i.e. the number of
#' interations of the null model}
#' }
#'
#' @seealso \code{\link{test_interactions}}, \code{\link{plot_preferences}}
#'
#' @references Agusti, N., Shayler, S.P., Harwood, J.D., Vaughan, I.P.,
#' Sunderland, K.D. & Symondson, W.O.C. (2003) Collembola as alternative prey
#' sustaining spiders in arable ecosystems: prey detection within predators
#' using molecular markers. \emph{Molecular Ecology}, \strong{12}, 3467--3475.
#'
#' Davey, J.S., Vaughan, I.P., King, R.A., Bell, J.R., Bohan, D.A.,
#' Bruford, M.W., Holland, J.M. & Symondson, W.O.C. (2013) Intraguild
#' predation in winter wheat: prey choice by a common epigeal carabid
#' consuming spiders. \emph{Journal of Applied Ecology}, \strong{50},
#' 271--279.
#'
#' King, R.A, Vaughan, I.P., Bell, J.R., Bohan, D.A, & Symondson, W.O.C.
#' (2010) Prey choice by carabid beetles feeding on an earthworm community
#' analysed using species- and lineage-specific PCR primers. \emph{Molecular
#' Ecology}, \strong{19}, 1721--1732.
#'
#' Pearson, C.E., Symondson, W.O.C., Clare, E.L., Ormerod, S.J.,
#' Iparraguirre Bolanos, E. & Vaughan, I.P. (2018) The effects of
#' pastoral intensification on the feeding interactions of generalist
#' predators in streams. \emph{Molecular Ecology}, \strong{27}, 590-602.
#'
#' Vaughan, I.P., Gotelli, N.J., Memmott, J., Pearson, C.E., Woodward, G. &
#' Symondson, W.O.C. (2018) econullnetr: an R package using null models to
#' analyse the structure of ecological networks and identify resource
#' selection. \emph{Methods in Ecology and Evolution}, \strong{9}, 728--733.
#'
#' @examples
#' null.1 <- generate_null_net(Silene[, 2:7], Silene.plants[, 2:6], sims = 10,
#' data.type = "names", summary.type = "sum", c.samples = Silene[, 1],
#' r.samples = Silene.plants[, 1])
#'
#' @export
generate_null_net <- function(consumers, resources, sims = 100,
data.type = "names", maintain.d = NULL,
summary.type = "sum", c.samples = NULL,
r.samples = NULL, r.weights = NULL,
prog.count = TRUE){
# Ensure input data are in data frame format (important when data are stored
# in tidyverse 'tibble' format)
consumers <- as.data.frame(consumers)
consumers[, 1] <- as.factor(consumers[, 1])
resources <- as.data.frame(resources)
if(!is.null(c.samples)) c.samples <- as.data.frame(c.samples)[, 1]
if(!is.null(r.samples)) r.samples <- as.data.frame(r.samples)[, 1]
if(!is.null(r.weights)) r.weights <- as.data.frame(r.weights)
# --------------------------------------
# Initial error handling:
# --------------------------------------
# 1. Column names are identical (names and order):
if (!identical(colnames(consumers)[-1], colnames(resources))) {
stop("Resource names do not match in 'consumers' and 'resources'")}
if(!is.null(r.weights) && (!identical(colnames(r.weights)[-1],
colnames(resources)))) {
stop("Names of 'r.weights' do not match names of 'resources'")}
# 2. Either both or neither of r.samples and c.samples are present
if (sum(is.null(r.samples), is.null(c.samples)) == 1) {
stop("Only one of 'r.samples' and 'c.samples' supplied")}
# 3. Check resource abundance data: length and sample codes
# When sample codes are not present, 'resources' should be a single row
if (is.null(r.samples) && (nrow(resources) != 1)) {
stop("Resource abundances should have one row")}
# When sample codes are present, the the length of 'c.samples' and 'r.samples'
# should match the number of rows in 'consumers' and 'resources' respectively,
# and the factor levels should be the same
if(!is.null(r.samples) && {
(nrow(resources) != length(r.samples)) ||
(nrow(consumers) != length(c.samples)) ||
(!identical(sort(unique(r.samples)), sort(unique(c.samples))))}) {
stop("There is a problem with the sample codes:
'c.samples' and/or 'r.samples'
are of incorrect length or their sample codes do not match")}
# 4. Abundance weights (typically forbidden link values) are in the range 0 - 1
if(!is.null(r.weights)) {
if(max(r.weights[, -1]) > 1 || min(r.weights[, -1]) < 0) {
stop("Abundance weights ('r.weights') must be between 0 and 1")}}
# 5. Issue a warning (cf. error) if a resource is consumed where the recorded
# abundance of the resource = 0 i.e. a resource that cannot be consumed in
# the null model. This may or may not require action by the user.
if (!is.null(c.samples)) {
spp.consumed <- stats::aggregate(consumers[, -1], by = list(c.samples), sum)
spp.consumed <- spp.consumed[order(spp.consumed$Group.1, decreasing = FALSE), ]
all.res <- cbind(r.samples, resources)
all.res <- all.res[order(all.res$r.samples, decreasing = FALSE), ]
if(sum(spp.consumed[, -1] > 0 & all.res[, -1] == 0) > 0) {warning(
"One or more instances detected where a consumer interacted with a
resource that has zero abundance in 'resources'")}
} else {
spp.consumed <- colSums(consumers[, -1])
all.res <- resources
if(sum(spp.consumed >0 & all.res == 0) > 0) {warning(
"There is at least one instance where a resource was consumed but
was zero in the abundance data (i.e. 'resources')")}
}
# 6. Correct types of data for data.type = "names" and "counts"
if(data.type == "names" & {sum(consumers[, -1] == 1 | consumers[, -1] == 0) !=
(nrow(consumers) * ncol(consumers[, -1]))}) {
stop("Entries in the consumer data should equal 0 or 1")}
if(data.type == "counts" & {sum(consumers[, -1] - round(consumers[, -1], 0)) > 0}) {
stop("Entries in the consumer data should be integers")}
# --------------------------------------
# --------------------------------------
# Set up a data frame that summarises the consumer data, containing:
# 1. The original order of the data; 2. Consumer species; 3. Number of
# interactions per individual and total interactions (sum); 4. Sample
# codes (or "A" if samples not specified)
if (!is.null(c.samples)){
c.summary <- data.frame(ord = seq(1, nrow(consumers)), c.sample = c.samples,
species = consumers[, 1],
links = rowSums(consumers[, 2:ncol(consumers)] != 0),
total = rowSums(consumers[, 2:ncol(consumers)]))
} else {
c.summary <- data.frame(ord = seq(1, nrow(consumers)),
c.sample = rep("A", nrow(consumers)),
species = consumers[, 1],
links = rowSums(consumers[, 2:ncol(consumers)] != 0),
total = rowSums(consumers[, 2:ncol(consumers)]))
}
# --------------------------------------
# --------------------------------------
# Create data frame 'rd' by adding sample codes, where specified, to the
# resource abundance data. Otherwise a null sample value 'A' is added
if (!is.null(c.samples)){
rd <- cbind.data.frame(r.samples, resources)
colnames(rd)[1] <- "r.sample"
} else {
rd <- cbind.data.frame(rep("A", nrow(resources)), resources)
colnames(rd)[1] <- "r.sample"
}
# --------------------------------------
# --------------------------------------
# 1. Add in the summary data for individual consumers ('c.summary')
# 2. Multiply the resource abundances by the r.weights matrix (if specified)
rd2 <- merge(c.summary, rd, by.x = "c.sample", by.y = "r.sample")
rd2 <- rd2[order(rd2$ord), ]
if (!is.null(r.weights)) {
abund.wghts <- merge(c.summary, r.weights, by.x = "species",
by.y = colnames(r.weights[1]))
abund.wghts <- abund.wghts[order(abund.wghts$ord), ]
# Error handling - check:
# 1. The columns of consumer names match in the resource abundance 'rd2' and
# abundance weights 'abund.wghts' data frames
# 2. The resource names and order match in 'rd2' and 'abund.wghts' before
# multiplying them together.
if (!identical(abund.wghts$species, rd2$species))
{stop("Consumer names in 'r.weights' and 'resources' do not match")}
if (!identical(colnames(abund.wghts[, 6:ncol(abund.wghts)]),
colnames(rd2[6:ncol(rd2)])))
{stop("Resource names in 'r.weights' and 'resources' do not match")}
rd2[, 6:ncol(rd2)] <- rd2[, 6:ncol(rd2)] * abund.wghts[, 6:ncol(abund.wghts)]
}
# Error handling: check that the number of potential resources => the maximum
# number of interactions of individual consumers once any forbidden links
# are applied (within each sub-division of the data, where present)
max.n.res <- data.frame(Subdiv = paste(rd2[, 1], rd2[, 3], sep = ""),
Links = rd2$links,
Nonzero = rowSums(rd2[, -c(1:5)] != 0))
max.n.res <- stats::aggregate(max.n.res[, 2:3], by = list(max.n.res$Subdiv), max)
if(sum(max.n.res$Links > max.n.res$Nonzero) > 0) {stop(
"Some consumers interact with more resources than are available (>0 abundance)
in 'resources', so the network cannot be modelled correctly")}
# --------------------------------------
# ======================================
# Outer loop of the null model
for (h in 1:sims){
# Create a data frame to contain the simulated data
res.df <- rd2[, 3:ncol(rd2)]
res.df[, 4:ncol(res.df)] <- 0
# -------------------------
# 1. Qualitative (0/1 data)
if (data.type == "names") {
for (i in 1:nrow(res.df)) {
res.choice <- sample(colnames(rd2[, 6:ncol(rd2)]), size = rd2$links[i],
prob = rd2[i, 6:ncol(rd2)], replace = FALSE)
res.df[i, res.choice] <- 1
}
}
# -------------------------
# 2. Count data
if (data.type == "counts") {
if (is.null(maintain.d) | FALSE) { # Not maintaining the degree
for (i in 1:nrow(res.df)) {
res.choice <- sample(colnames(rd2[, 6:ncol(rd2)]), size = rd2$total[i],
prob = rd2[i, 6:ncol(rd2)], replace = TRUE)
pc <- data.frame(table(res.choice))
rownames(pc) <- pc$res.choice
res.df[i, rownames(pc)] <- pc[rownames(pc), 2]
}
}
if (!is.null(maintain.d) && TRUE) { # Maintaining the degree - 2 stage process
for (i in 1:nrow(res.df)) {
# Pt 1 - select the resource taxa
choice <- sample(colnames(rd2[, 6:ncol(rd2)]), size = rd2$links[i],
prob = rd2[i, 6:ncol(rd2)], replace = FALSE)
choice <- data.frame(Taxon = choice, pt1 = 1)
# Pt 2 - allocate additional interations so the modelled total = observed total
choice.2 <- sample(choice$Taxon, size = rd2$total[i] - rd2$links[i],
replace = TRUE)
choice.2 <- data.frame(table(choice.2))
choice <- merge(choice, choice.2, by.x = "Taxon", by.y = "choice.2")
choice$count <- rowSums(choice[, -1])
rownames(choice) <- choice$Taxon
res.df[i, rownames(choice)] <- choice[rownames(choice), 4]
}
}
}
# -------------------------
# 3. Measured quantities
if (data.type == "quantities") {
if (is.null(maintain.d) | FALSE) { # Not maintaining the degree
for (i in 1:nrow(res.df)) {
resource.props <- as.matrix(rd2[i, 6:ncol(rd2)] /
sum(rd2[i, 6:ncol(rd2)]))
res.choice <- gtools::rdirichlet(1, resource.props)
res.choice <- res.choice * rd2[i, "total"]
res.df[i, 4:ncol(res.df)] <- res.choice
}
}
if (!is.null(maintain.d) && TRUE) { # Maintaining the degree - 2 stage process
for (i in 1:nrow(res.df)) {
# Pt 1 - select the resource taxa
choice <- sample(colnames(rd2[, 6:ncol(rd2)]), size = rd2$links[i],
prob = rd2[i, 6:ncol(rd2)], replace = FALSE)
choice <- data.frame(Taxon = choice, pt1 = 1)
# Pt 2 - allocate interations among the selected taxa
rp <- as.matrix(rd2[i, as.character(choice$Taxon)])
colnames(rp) <- choice$Taxon
resource.props <- as.matrix(rp / sum(rp))
repeat{
res.choice <- gtools::rdirichlet(1, resource.props)
if(min(res.choice) > 0) break
}
colnames(res.choice) <- colnames(rp)
res.choice <- res.choice * rd2[i, "total"]
res.df[i, colnames(res.choice)] <- res.choice[, colnames(res.choice)]
}
}
}
# --------------------------------------
# --------------------------------------
# Prepare output from the iteration of the model (iterations indexed by
# the value of h)
if (summary.type == "sum") {
res.agg <- stats::aggregate(res.df[, 4:ncol(res.df)],
by = list(res.df$species), sum)
colnames(res.agg)[1] <- "Consumer"
res.agg <- cbind(Iteration = as.matrix(rep(paste("It.", h, sep=""),
length(res.agg[, 1]))), res.agg)
ifelse(h == 1, res.compiled <- res.agg, {
res.compiled <- rbind(res.compiled, res.agg)})
}
if (summary.type == "mean") {
res.agg <- stats::aggregate(res.df[, 4:ncol(res.df)],
by = list(res.df$species), mean)
colnames(res.agg)[1] <- "Consumer"
res.agg <- cbind(Iteration = as.matrix(rep(paste("It.", h, sep=""),
length(res.agg[, 1]))), res.agg)
ifelse(h == 1, {res.compiled <- res.agg}, {
res.compiled <- rbind(res.compiled, res.agg)})
}
# --------------------------------------
# Progress count
if (prog.count == TRUE){Sys.sleep(0.02); print(h); utils::flush.console()}
}
# --------------------------------------
# Compile output objects:
# 1. Observed consumer-resource interactions
# 2. Null model outputs
if (summary.type == "sum") {
obs.interactions <- stats::aggregate(consumers[, 2:ncol(consumers)],
list(consumers[, 1]), sum)
colnames(obs.interactions)[1] <- "Consumer"
null.interactions <- stats::aggregate(res.compiled[, 3:ncol(res.compiled)],
list(res.compiled$Consumer), mean)
colnames(null.interactions)[1] <- "Consumer"
if(!identical(colnames(obs.interactions), colnames(null.interactions)) |
!identical(obs.interactions$Consumer, null.interactions$Consumer))
{stop("Different observed and expected taxon names")}
}
if (summary.type == "mean") {
obs.interactions <- stats::aggregate(consumers[, 2:ncol(consumers)],
list(consumers[, 1]), mean)
colnames(obs.interactions)[1] <- "Consumer"
null.interactions <- stats::aggregate(res.compiled[, 3:ncol(res.compiled)],
list(res.compiled$Consumer), mean)
colnames(null.interactions)[1] <- "Consumer"
if(!identical(colnames(obs.interactions), colnames(null.interactions) )|
!identical(obs.interactions$Consumer, null.interactions$Consumer))
{stop("Different observed and expected names")}
}
# --------------------------------------
# --------------------------------------
# Create output list
null.net.res <- list(rand.data = res.compiled, obs.interactions = obs.interactions,
n.iterations = sims)
class(null.net.res) <- "nullnet"
return(null.net.res)
# --------------------------------------
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.