knitr::opts_chunk$set(fig.width=8)

library(dplyr)
library(ggplot2)
library(ggthemes)
library(lme4)
library(LSRanalysis)
library(MASS)
library(plotly)
library(tidyr)
library(viridis)
data(LSR_monitor)

# # Will convert the 2007 observers to arbitrary numbers
# LSR_monitor$Observer <- ifelse(LSR_monitor$Observer == "JM",
#                                1,
#                                ifelse(LSR_monitor$Observer == "AP",
#                                       2,
#                                       ifelse(LSR_monitor$Observer == "BH",
#                                              3,
#                                              LSR_monitor$Observer)))
# LSR_monitor$Observer <- as.numeric(LSR_monitor$Observer)

Abstract

Since 2007, U.S. Fish and Wildlife Service personnel have monitored populations of Lilaeopsis schaffneriana var. recurva (LSR), a species listed as endangered under the U.S. Endangered Species Act, at San Bernardino and Leslie Canyon National Wildlife Refuges in southeast Arizona. The species is a desert wetland obligate whose range has been substantially reduced by more than a century of land use that reduced the number, extent, and natural dynamics of wetlands in Arizona and Sonora, Mexico. Here we analyze monitoring data for 2007 and 2010-2017. We show that the area occupied by and number of patches of LSR exhibits high inter-annual variability, with a general decline to 2015 and an increase in 2016-2017. The number of sites (separate groups of patches) where LSR is found on the refuges has steadily increased from one in 2007 to five in 2016, adding resiliency and redundancy for these populations. These results paint a generally positive picture of LSR status on the refuges, but the variability of populations highlights how biological and hydrological variability still pose a significant risk to the species.

Introduction

Since 2007, U.S. Fish and Wildlife Service personnel have monitored populations of Lilaeopsis schaffneriana var. recurva (LSR), a species listed as threatened under the U.S. Endangered Species Act (ESA), at San Bernardino and Leslie Canyon National Wildlife Refuges (SBLCNWR) in southeast Arizona. The species is a desert wetland obligate whose range has been substantially reduced by more than a century of land use that reduced the number, extent, and natural dynamics of wetlands in Arizona and Sonora, Mexico. LSR historically occurred on San Bernardino NWR but was apparently extirpated by the late 1990s, and plugs were transplanted in Leslie Canyon NWR in the 1990s, where the species still persists today.

The LSR monitoring protocol requires that multiple people independently identify patches of the species at a given site, sketch the extent of those patches, and measure the patches (to the nearest 0.1m) and record the measurements on the sketches. After field work is complete, the sketches and their measurements are returned to the lab and computer software - either TurboCAD or ArcGIS - is used to draw representations of each patch to the scale dictated by the measurements. From these digital renderings the areal extent of LSR can be estimated for each site and as measured by each observer.

Here we analyze LSR monitoring data from the years r unique(LSR_monitor$Year) and across r length(unique(LSR_monitor$Site)) sites at SBLCNWRs.

Data overview

Before attempting an analysis, we provide a summary of the data. First, a simple look at the first few lines of data:

dat <- head(LSR_monitor)
knitr::kable(dat, align="c", digits = 2, caption = "__Table 1.__ The first six rows of the LSR monitoring data set.")

Note that the Obs_2 variable is a re-coding of the Observer variable, which is needed for the Analysis, below.

Next, we need to know the distribution of the area occupied:

qplot(LSR_monitor$Area, geom = "histogram") +
  labs(x = "Area (m^2)") +
  theme_hc()

And summarize the other variables:

n_years <- length(unique(LSR_monitor$Year))
n_sites <- length(unique(LSR_monitor$Site))
patches <- mean(tapply(LSR_monitor$Patch,
                       INDEX = LSR_monitor$Year,
                       FUN = function(x) length(unique(x))))
observs <-  mean(tapply(LSR_monitor$Observer,
                       INDEX = LSR_monitor$Year,
                       FUN = function(x) length(unique(x))))
df <- data.frame(n_years, n_sites, patches, observs)
names(df) <- c("# Years", "# Sites", "Mean # patches", "Mean # observers")
knitr::kable(df, align = "c", digits = 2, caption = "__Table 2.__ Basic summary of other variables in the LSR monitoring data set.")

Figures

As an initial overview of relationships among variables, we will calculate the sum of the area estimated by each observer each year and plot those points and boxplots:

by_yr <- aggregate(Area ~ Year + Obs_2, data = LSR_monitor, FUN = sum, na.rm =TRUE)
ggplot(data = by_yr, aes(x = factor(Year), y = Area)) +
  geom_boxplot(fill = "white") +
  geom_jitter(colour = "tan4", size = 4, alpha = 0.8, width = 0.3, height = 0.1) +
  labs(x = "Year",
       y = "Area (m^2)") +
  theme_hc()

This figure shows that the total amount of LSR on SBLCNWR is highly variable between years. The between-observer variation tends to be small relative to the inter-annual variation, but we're not sure what is going on with the 2011 and 2012 data. (Most years the individual point estimates are relatively tight, but there's an obvious outlier for 2011 [low] and 2012 [high].)

What are the median and mean estimates of LSR area occupied by year?

atab <- aggregate(Area ~ Year, 
                  data = by_yr, 
                  FUN = function(x) round(median(x), 2))
btab <- aggregate(Area ~ Year, 
                  data = by_yr, 
                  FUN = function(x) round(mean(x), 2))
ctab <- left_join(atab, btab, by = "Year")
names(ctab) <- c("Year", "Median Area", "Mean Area")
knitr::kable(ctab, align = "c", caption = "__Table 3.__ The median and mean estimates of area occupied by LSR by year, without accounting for observer variability.")

At how many sites and how many patches was LSR found, by year?

atab <- aggregate(Site ~ Year, 
                  data = filter(LSR_monitor, !is.na(Area)), 
                  FUN = function(x) length(unique(x)))
atab$Variable <- rep("# Sites", length(atab[[1]]))
names(atab)[2] <- "Count"

btab <- aggregate(Patch ~ Year + Observer, 
                  data = filter(LSR_monitor, !is.na(Area)), 
                  FUN = function(x) length(unique(x)))
btab <- aggregate(Patch ~ Year,
                  data = btab,
                  FUN = mean)
btab$Variable <- rep("# Patches", length(btab[[1]]))
names(btab)[2] <- "Count"

dtab <- rbind(atab, btab)

p <- ggplot(dtab, aes(Year, Count, colour = Variable)) +
     geom_line(size = 1.5) +
     geom_point(size = 4) +
     ylim(c(0, 15)) +
     theme_hc() +
     theme(legend.position = "right") +
     scale_color_viridis(discrete = TRUE)
p

The inter-annual variation of area occupied by LSR and number of discrete patches is relatively high and there is no single trend: the apparent negative trend in number of patches is driven by the high number of patches in 2007. In contrast, the number of sites at which LSR is found generally increased since 2007, except for the loss of two sites in 2017.

area_number <- left_join(ctab, atab, by = "Year")
area_number$Year <- factor(area_number$Year)
p <- ggplot(data = area_number, aes(`Median Area`, Count, colour = Year)) +
     geom_point(size = 4, alpha = 0.8) +
     labs(x = "Area (median)",
          y = "# Patches") +
     theme_hc() +
     theme(legend.position = "right") +
     scale_color_viridis(discrete = TRUE)
p

Analysis

Analyzing the LSR monitoring data is a little tricky. We're interested in estimating a population "trend," but need to also determine how important inter-individual variation is to such a trend estimate. The observers are fundamentally a random effect, which necessitates a mixed-model approach. To evaluate the relative importance of Year vs. Observer, we can craft a set of models with different parameterizations, including fixed Year effects with a polynomial (order = 2), random Year effects, and random Observer effects. If Observer is relatively unimportant, we should expect models lacking the term to fit substantially better than models with the term, or models with both terms fitting about the same as Year-only models.

mod1 <- lmer(Area ~ poly(Year, 2) + (1|Obs_2), data = LSR_monitor)
mod1b <- lmer(Area ~ (1|Year) + (1|Obs_2), data = LSR_monitor)
mod2 <- lm(Area ~ poly(Year, 2), data = LSR_monitor)
mod3 <- lmer(Area ~ (1|Obs_2), data = LSR_monitor)
mod4 <- lmer(Area ~ (1|Year), data = LSR_monitor)

anova(mod1, mod1b, mod2, mod3, mod4)

We see that models 1b and 4 have nearly-identical AIC scores that are nearly 50 AIC units lower than the next-best model. Both of these models include Year as a random effect term but without a polynomial (Year is treated as a discrete factor rather than a continuous numeric variable). That is, these models fit better than all other models considered. Because they are nearly equivalent, but model 1b has an extra term for Observer to achieve similar fit, we can conclude that the inter-annual variation is more important for explaining the variation in the data.

Discussion

Population monitoring is a key component of managing wildlife, especially when the managed species is threatened or endangered. Since 2007, FWS personnel have monitored the populations of an endangered plant, Lilaeopsis schaffneriana var. recurva. This analysis shows that the total area occupied by the species and the number of patches of the species exhibit high inter-annual variability. Variation in water availability (especially in Leslie Canyon), flooding, and succession by other herbaceous vegetation all contribute to the variability. It also shows that the number of sites where LSR is found has steadily increased since 2007, which is good thing because more sites translates to increased resiliency and redundancy, two of three components of the Three Rs for conservation.

We found that inter-observer variability is small relative to inter-annual variability. One response to this result could be that additional observers are not necessary, but we would caution against this conclusion for two reasons. First, while the 2007-2017 data exhibited high inter-annual variability, a period of low variability could make accounting for observer variability much more important. Consider the area occupied during the period 2013-2015 (Figure 2): it's possible to connect a single dot from each of those years to show a population increase or decline. But with the data points from all four observers it is clear that there is a lot of observer variability and no conclusion about a trend one way or another. Second, the outlier total area estimates from 2011 and 2012 highlight that there can be dramatic differences among observers. The raw data from the field is no longer available for these two years, and even if it was, there is no way to travel back in time to double-check the measurements. We recommend that the field data be entered (i.e., the polygons drawn and areas estimated), and the per-observer estimates of total area calculated, within a day or two of making the measurements. If there is one outlier, then the sites should be revisited by all observers (if possible) to try to determine the cause of the discrepancy. It may be that one or two measurements were mis-recorded, or that the observer responsible for the outlier total saw things that others did not. Either way, the use of multiple observers is key to getting as accurate an estimate of area occupied as possible, which is difficult when talking about fuzzily-defined patches of a species in nature.

In summary, the data indicate LSR is subject to high inter-annual variation, likely driven by a combination of hydrological and biological factors. The 2016 data in particular was encouraging because it shows the combination of larger area occupied and a larger number of occupied sites than in the past. The 2017 data includes a slight increase in the area occupied by LSR, but two sites (Leslie Canyon and North Pond) were lost because of drought. If drought conditions improve, then we anticipate transplants will help boost the number of sites back up again.



jacob-ogre/LSRanalysis documentation built on May 18, 2019, 8 a.m.