knitr::opts_chunk$set(collapse = TRUE
                      , comment = "#>")

Introduction

MBSStools was created by Tetra Tech, Inc. in 2017 to meet the needs of Maryland Department of Natural Resources (DNR), Maryland Biological Stream Survey (MBSS) staff to complete data manipulations in an efficient and reproducible manner.

It is envisioned that this library will be a living program and will add additional functions and routines in the future.

Many of the examples in this vignette are included in the example sections in the corresponding functions. Each function in the MBSStools library includes example data from MBSS.

Installation

The R library is saved on GitHub (https://github.com/leppott/MBSStools) for ease of updating and distribution. Issues can be tracked, fixed, and code is available for download. Current users of MBSStools can update the library using the same code used to install the library (see below). Devtools is required to download from GitHub. At this time, there are no plans to submit MBSStools to CRAN (R's library repository).

install.packages("devtools")
library(devtools)
install_github("leppott/MBSStools")

To contact the author directly email Erik.Leppo@tetratech.com.

Packages

Several packages were used to build the functionality in MBSStools.

  1. FlowSum; no extra packages

  2. IonContrib; no extra packages

  3. MapTaxaObs; readxl and rgdal

  4. metric.scores; dplyr

  5. metric.values; dplyr

  6. PHIcalc; no extra packages

These packages should install automatically when MBSStools installs. But if you encounter issues with a function not working ensure that the necessary package dependencies are installed.

To install packages follow the example of the code below for installing dplyr.

install.packages("dplyr")

Contents

There are several functions included in the library each with a particular focus on a dataset and the necessary calculations for data analysis.

  1. Index of Biological Integrity (IBI) calculations for fish and benthic macroinvertebrates.

  2. Taxonomic distribution maps for fish.

  3. Stream discharge calculation.

  4. Ion contribution to conductivity.

  5. Physical Habitat Index (PHI) calculation.

Example data are included with each function.

IBI

MBSStools calculates both the benthic macroinvertebrate IBI (BIBI) and fish IBI (FIBI). The IBIs are described in the document Southerland et al. 2005 that is included in the /extdata folder of the library.

There are two functions; metric.values() to calculate the metrics (e.g., total individuals or total taxa) and metric.scores() that converts the metric values to the appropriate score (e.g., 1, 3, or 5). The functions are the same for fish and benthic macroinvertebrates. Only the argument fun.Community needs to be changed to "bugs" or "fish".

MBSStools does not aggregate or compile taxa lists. Taxa data with site/sample information, counts, phylogenetic (phylum, class, order, etc), and autecological (habit, tolerance value, and function feeding group) information is required as an input to the functions. Fish data needs the additional values of watershed (catchment) area (in acres), average stream width, and reach length.

A background table (metrics_scoring) is included and provides a table with the metric names and scoring thresholds for each IBI and index region. This table is populated with the appropriate values for the MBSS IBIs. This table can be modified.

The functions metric.values() and metric.scores() require the dplyr library.

IBI, Fish

Calculates the fish IBI.

The adjustment of metrics based on watershed area is automatic for fish metrics.

Fish metric values assumes the following fields (all upper case)

** Sites with a single record indicating "No Fish Observed" receive a default score of 1.00.

# Metrics, Fish
library(MBSStools)
#(generate values then scores)
myIndex <- "MBSS.2005.Fish"
# Thresholds
thresh <- metrics_scoring
# get metric names for myIndex
(myMetrics.Fish <- as.character(unique(thresh[thresh[,"Index.Name"]==myIndex,"MetricName.Other"])))
# Taxa Data
myDF.Fish <- taxa_fish
myMetric.Values.Fish <- metric.values(myDF.Fish, "fish", myMetrics.Fish, TRUE)
knitr::kable(head(myMetric.Values.Fish))
# SCORE
Metrics.Fish.Scores <- metric.scores(myMetric.Values.Fish
                                     , myMetrics.Fish
                                     , "Index.Name"
                                     , "FIBISTRATA"
                                     , thresh)
# View Results
knitr::kable(head(Metrics.Fish.Scores))
#Write results to csv file
#write.csv(Metrics.Fish.Scores, "Filename.csv")

IBI, Benthic Macroinvertebrates, MBSS

Bug metric values assumes the following fields (all upper case)

Valid values for FFG: COLLECTOR, FILTERER, PIERCER, PREDATOR, SCRAPER, SHREDDER

Valid values for HABIT: BU, CB, CN, SP, SW

Maryland Stream Waders (MSW) data should be first combined to family level and EXCLUDE recalculated.

Additional fields needed:

In addition to the metric scores a QC check on samples with too many (>120) or too few (<60) organisms is returned in the results. Those with more than the maximum (120) should be subsampled down to within the target range.

# Metrics, Index, Benthic Macroinvertebrates, genus
# (generate values then scores)
myIndex <- "MBSS.2005.Bugs"
# Thresholds
thresh <- metrics_scoring
# get metric names for myIndex
(myMetrics.Bugs.MBSS <- as.character(unique(thresh[thresh[,"Index.Name"]==myIndex,"MetricName.Other"])))
# Taxa Data
myDF.Bugs.MBSS <- taxa_bugs_genus
myMetric.Values.Bugs.MBSS <- metric.values(myDF.Bugs.MBSS, "bugs", myMetrics.Bugs.MBSS)
# SCORE
Metrics.Bugs.Scores.MBSS <- metric.scores(myMetric.Values.Bugs.MBSS, myMetrics.Bugs.MBSS, "INDEX.NAME", "STRATA_R", thresh)
#Write results to csv file
#write.csv(Metrics.Bugs.Scores.MBSS, "Filename.csv")

# QC bug count
Metrics.Bugs.Scores.MBSS[Metrics.Bugs.Scores.MBSS[,"totind"]>120,"QC_Count"] <- "LARGE"
Metrics.Bugs.Scores.MBSS[Metrics.Bugs.Scores.MBSS[,"totind"]<60,"QC_Count"] <- "SMALL"
Metrics.Bugs.Scores.MBSS[is.na(Metrics.Bugs.Scores.MBSS[,"QC_Count"]),"QC_Count"] <- "OK"
# table of QC_Count
knitr::kable(head(Metrics.Bugs.Scores.MBSS))

IBI, Benthic Macroinvertebrates, MSW

The Maryland Stream Wader index is from Stribling et al. 1999. Metric values and scores shown below for example data.

# Metrics, MSW Index, Benthic Macroinvertebrates, family
myIndex <- "MSW.1999.Bugs"
# Thresholds
thresh <- metrics_scoring
# get metric names for myIndex
(myMetrics.Bugs.MSW <- as.character(unique(thresh[thresh[,"Index.Name"]==myIndex,"MetricName.Other"])))
# Taxa Data
myDF.Bugs.MSW <- taxa_bugs_family
myMetric.Values.Bugs.MSW <- metric.values(myDF.Bugs.MSW, "bugs", myMetrics.Bugs.MSW)
knitr::kable(head(myMetric.Values.Bugs.MSW))
# SCORE
Metrics.Bugs.Scores.MSW <- metric.scores(myMetric.Values.Bugs.MSW, myMetrics.Bugs.MSW, "INDEX.NAME", "STRATA_R", thresh)
# View Results
knitr::kable(head(Metrics.Bugs.Scores.MSW))

Fish Distribution Maps

The intent of this function is to recreate the taxa distribution maps on the MBSS website: http://eyesonthebay.dnr.maryland.gov/mbss/fishes.cfm. The maps for all taxa can be generated from a single line of code. The maps generated with this function use a 'crosswalk' table that converts taxa names to map names.

The function MapTaxaObs() requires the readxl and rgdal packages.

Data inputs

The user will need GIS files for the state of Maryland for State, County, and Water.

The inputs for this function, besides the three GIS shapefiles, are taxa observations and a taxa name crosswalk (translation) table between taxa names and map names.

All inputs are included in the MBSStools library and can be copied to the current directory with the code below.

# Set Working Directory
wd <- getwd()
# Create Example Data if Needed
## Create Directories
myDir.create <- file.path(wd,"Data")
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
myDir.create <- file.path(wd,"GIS")
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
myDir.create <- file.path(wd,"Maps")
ifelse(dir.exists(myDir.create)==FALSE,dir.create(myDir.create),"Directory already exists")
## Create Data
### Taxa Data
myFiles <- c("AllFish_95to16.xls", "TaxaMapsCrossWalk20170731.xlsx")
file.copy(file.path(path.package("MBSStools"),"extdata",myFiles),file.path(wd,"Data",myFiles))
### GIS
unzip(file.path(path.package("MBSStools"),"extdata","MD_GIS.zip"),exdir=file.path(wd,"GIS"))

Example Input Data

Observation data with three columns needed for the tool to work (common name, latitude, and longitude). Additional columns can be appended (e.g., SiteYr) and will not affect the functionality.

# library
library(readxl)

# data files
obs <- "AllFish_95to16.xls"
xWalk <- "TaxaMapsCrossWalk20170731.xlsx"

all.taxa <- system.file("extdata", obs, package = "MBSStools")
taxa.lu <- system.file("extdata", xWalk, package = "MBSStools")

# Read in Taxon observations
df.taxa.obs <- readxl::read_excel(all.taxa, sheet=1,col_names=TRUE,skip=0)
df.taxa.obs <- as.data.frame(df.taxa.obs)
df.taxa.obs[,1] <- tolower(df.taxa.obs[,1])
#df.taxa.obs <- as.data.frame(cbind(tolower(df.taxa.obs[,"TaxaName"]), df.taxa.obs[,2:3]))
colnames(df.taxa.obs)[1] <- "CommonName"

head(df.taxa.obs)

Crosswalk data with common name and map name.

# library
library(readxl)

# data files
obs <- "AllFish_95to16.xls"
xWalk <- "TaxaMapsCrossWalk20170731.xlsx"

all.taxa <- system.file("extdata", obs, package = "MBSStools")
taxa.lu <- system.file("extdata", xWalk, package = "MBSStools")

# Read in TaxaMapsCrossWalk.xlsx
df.lu.taxa <- readxl::read_excel(taxa.lu,sheet=1,col_names=TRUE,skip=0)
df.lu.taxa <- as.data.frame(df.lu.taxa[,c("CommonName","MapName")])
df.lu.taxa[,"CommonName"] <- tolower(df.lu.taxa[,"CommonName"])
# df.lu.taxa <- as.data.frame(cbind(tolower(df.lu.taxa[,"CommonName"]),df.lu.taxa[,"MapName"]))
# colnames(df.lu.taxa)[1:2] <- c("CommonName","MapName")

head(df.lu.taxa)

Map Creation

The code below can be used to generate the maps (n=131) associated with the 1995 to 2016 data. Any taxa without a match in the crosswalk table (n=54) are listed in a CSV file and the number reported to the user in the R console, but maps are not generated (by default). If the user wants all maps then the parameter 'onlymatches' should be set to 'FALSE' (as in the example below).

Updated in 2024 to replace the rgdal package (deprecated in 2023) with the sf package.

# Set Working Directory
wd <- getwd()

# Inputs
Obs <- "AllFish_95to16.xls"
XWalk <- "TaxaMapsCrossWalk20170731.xlsx"
wd <- getwd()

# Create maps
MapTaxaObs(Obs, XWalk, wd, onlymatches = FALSE)

The intent behind this function was to recreate the fish taxa distribution maps on the DNR website. The data inputs are not specific to fish and can be used for benthic macroinvertebrate data.

The taxa distribution map for brown trout is shown below.

Points with latitude and longitude are imported as WGS84 (CRS = 4326) then converted to NAD83 Maryland (meters) (CRS = 26985) before plotting using the sf package.

# example map.
# include only brown trout
# brntrout.png
# assume already created directories

# Packages
library(MBSStools)
library(readxl)
library(sf)

# Set Working Directory
wd <- tempdir()
# Create Example Data if Needed
## Create Directories
## Create a “Data” folder, a “GIS” folder, and a “Maps” folder
myDir.create <- file.path(wd, "Data")
ifelse(dir.exists(myDir.create) == FALSE, 
       dir.create(myDir.create), 
       "Directory already exists")
myDir.create <- file.path(wd, "GIS")
ifelse(dir.exists(myDir.create) == FALSE,
       dir.create(myDir.create),
       "Directory already exists")
myDir.create <- file.path(wd, "Maps")
ifelse(dir.exists(myDir.create) ==FALSE,
       dir.create(myDir.create),
       "Directory already exists")

## Create Data
## Make copies of datafiles and place into “Data” folder
### Taxa Data
myFiles <- c("AllFish_95to16.xls", "TaxaMapsCrossWalk20170731.xlsx")
file.copy(file.path(path.package("MBSStools"), "extdata", myFiles),
          file.path(wd, "Data", myFiles))

### GIS
# Unzip GIS file in MBSS tools “extdata” folder and place in “GIS” folder
unzip(file.path(path.package("MBSStools"), "extdata","MD_GIS.zip"), 
      exdir = file.path(wd, "GIS"))

# data files
obs <- "AllFish_95to16.xls"
xWalk <- "TaxaMapsCrossWalk20170731.xlsx"

dirMain     <- wd  
dirData     <- "Data"
dirGIS      <- "GIS"
verbose     <- TRUE
onlymatches <- TRUE

all.taxa <- system.file("extdata", obs, package = "MBSStools")
taxa.lu <- system.file("extdata", xWalk, package = "MBSStools")

# Read in Taxon observations
df.taxa.obs <- readxl::read_excel(all.taxa,
                                  sheet = 1,
                                  col_names = TRUE,
                                  skip = 0)
df.taxa.obs <- as.data.frame(df.taxa.obs)
df.taxa.obs[,1] <- tolower(df.taxa.obs[, 1])
colnames(df.taxa.obs)[1] <- "CommonName"

## Trim to browntrout
df.taxa.obs <- df.taxa.obs[df.taxa.obs[, "CommonName"] == "brown trout", ]

# Read in TaxaMapsCrossWalk.xlsx
df.lu.taxa <- readxl::read_excel(taxa.lu,
                                 sheet = 1,
                                 col_names = TRUE,
                                 skip = 0)
df.lu.taxa <- as.data.frame(df.lu.taxa[, c("CommonName", "MapName")])
df.lu.taxa[,"CommonName"] <- tolower(df.lu.taxa[, "CommonName"])

# 4. Munge Data ####
# Compare TaxaName to CommonName
taxa.names <- as.vector(df.lu.taxa[, "CommonName"])
matches <- as.vector(df.taxa.obs[, "CommonName"]) %in% taxa.names
df.taxa.nomatch <- as.data.frame(
  unique(sort(df.taxa.obs[,"CommonName"][!matches])))
colnames(df.taxa.nomatch)[1] <- "CommonName"
df.taxa.match <- as.data.frame(
  unique(sort(df.taxa.obs[, "CommonName"][matches])))
colnames(df.taxa.match)[1] <- "CommonName"

#if (onlymatches == TRUE) {
    # Create a data frame of common names and filenames for matching taxa
    map.taxa <- subset(df.lu.taxa, CommonName %in% df.taxa.match[,"CommonName"],
                       select = c(CommonName,MapName))

# 5. Mapping ####
ppi <- 72
dsn <- file.path(dirMain, dirGIS)
state     <- sf::st_read(dsn = dsn, layer = "MD_State_Boundary")
coastline <- sf::st_read(dsn = dsn, layer = "MD_Coast_Hydrology")
counties  <- sf::st_read(dsn = dsn, layer = "MD_Boundary_County_Detailed")

i<-1

taxon <- as.character(map.taxa$CommonName[i])

  filename <- map.taxa$MapName[i]
  df.taxon.sites <- subset(df.taxa.obs, df.taxa.obs[, "CommonName"] == taxon)
  df.taxon.sites <- subset(df.taxon.sites, !is.na(df.taxon.sites["Latitude83"]))

  plot(sf::st_geometry(state), 
       col = "white", 
       border = "gray")

  plot(sf::st_geometry(coastline), 
       add = TRUE, 
       col = "light blue", 
       border = FALSE)

  plot(sf::st_geometry(counties), 
       add = TRUE, 
       col = "white",
       border = "darkslategray", 
       lwd = 0.5)

  ## sites
  # 4326, WGS 84
  # 6487, NAD83(2011)/Maryland in meters
  # 6488, NAD83(2011)/Maryland in US survey feet
  # 26985, NAD83/Maryland                             *this one*
  # 2248
  sites_wgs84 <- sf::st_as_sf(df.taxon.sites,
                             coords = c("Longitude83", "Latitude83"),
                             crs = 4326)
  sites_nad83md_m <- sf::st_transform(sites_wgs84, crs = 26985)
  ### plot
  plot(sites_nad83md_m,
       pch = 21, 
       col = "black", 
       bg  = "green", 
       cex = 1.0,
       type = "p",
       add = TRUE)

# open temp dir (Windows only)
#shell.exec(tempdir())

Stream Discharge

Calculate stream discharge based on field measurements. Side channels that are properly identified in the data will be included.

The function FlowSum() requires no additional packages.

library(MBSStools)
# data
MBSS.flow <- MBSS.flow
# calculate flow
flow.cell <- FlowSum(MBSS.flow, returnType="cell")
flow.sample <- FlowSum(MBSS.flow)
# examine data
View(flow.cell)
View(flow.sample)
# Example Save file (tab delimited text file)
datetime <- format(Sys.time(),"%Y%m%d_%H%M%S")
myYear <- "15"
#write.table(flow.sample,paste0("SumFlow",myYear,"_",datetime,".tab"),row.names=FALSE,sep="\t")

Ion Matrix

The IonContrib() function calculates conductivity contribution from provided ion concentrations using Kohlrausch's Law. If a total conductivity measurement is not provided, the total conductivity is calculated as the sum of all ions present in the data. If the user provides a conductivity measurement, then ion contributions are a percentage of that number. In the latter case, "Other" is added as an ion to capture any percentage of total conductivity not represented by the provided ions. Plotting is done outside of this function.

Data will need to be in "wide" format. That is, a single record for each sample and the ions to be analyzed in columns.

A reference table of ions and their equivalent ionic conductance at infinite dilution is provided with the function as "MBSS.Ion.Ref". The function allows for input of a user supplied data frame if this table needs updating with more ions.

The function IonContrib() requires no additional packages.

knitr::kable(MBSS.Ion.Ref)

The columns for the ions need to match the reference table names. The function IonContrib() includes a reference table that is used by default but the user can supply their own reference table.

knitr::kable(head(MBSS.Ion.Data))

Example code is below. The user would then save the output (e.g., write.csv()).

library(MBSStools)
# Load Data
# MBSS.Ion.Data = example dataset
data.ion <- MBSS.Ion.Data
# Load Reference Table
ref.ion <- MBSS.Ion.Ref
# Calculate Percent Contribution by Ion (data table only)
contrib.ion <- IonContrib(data.ion)
knitr::kable(head(contrib.ion))

Dot charts are a good way to display the data. But any chart of the user's preference can be used.

# Load Data
data.ion <- MBSS.Ion.Data
# Load Reference Table
ref.ion <- MBSS.Ion.Ref
# Calculate Percent Contribution by Ion (data table only)
contrib.ion <- IonContrib(data.ion)
knitr::kable(contrib.ion)
# Calculate Percent Contribution by Ion (data table and reference table)
contrib.ion.2 <- IonContrib(data.ion, ref.ion)
knitr::kable(contrib.ion.2)
# Calculate Percent Contribution by Ion (data table, reference table, and conductivity values)
# ## add dummy conductivity values for each sample
myCond <- "Cond"
data.ion[,myCond] <- 4000
contrib.ion.3 <- IonContrib(data.ion, ref.ion, myCond)
knitr::kable(contrib.ion.3)
# Save Results (use write.table())
# Plot Results
myIons <- c("Chloride (mg/L)", "Bromide (mg/L)", "Nitrate-N (mg/L)", "Sulfate (mg/L)", "Sodium (mg/L)",
     "Potassium (mg/L)", "Magnesium (mg/L)",  "Calcium (mg/L)" )
myIons.Contrib <- paste0("PctContrib.",myIons)
mySite <- "BELK-101-X"
data.plot <- subset(contrib.ion, contrib.ion[,"Site"]==mySite, select=c("Site","Cond.Total",myIons.Contrib,myIons))
## Plot, one site (with cond value)
dotchart(as.numeric(as.vector(data.plot[,myIons.Contrib])), labels=myIons, main=mySite, xlab="Percent Contribution", pch=19, pt.cex=1.2)
mtext(paste0("Conductivity (uS/cm) = ",round(data.plot[,"Cond.Total"],1)))
## Plot all sites using panel.dotplot() in the lattice package
#
## Plot as a horizontal bar plot
# need to change margins to fit labels
#par(mai=c(1,2,1,1))
barplot(as.numeric(as.vector(data.plot[,myIons.Contrib])), names.arg=myIons, main=mySite, xlab="Percent Contribution"
       , axes=TRUE, horiz=TRUE, las=1)
mtext(paste0("Conductivity (uS/cm) = ",round(data.plot[,"Cond.Total"],1)))

PHI

The PHIcalc() function calculates the MBSS Physical Habitat Index (PHI), Paul et al. 2003. The report is included in the /extdata folder of the library.

There are two versions of the calculation based on pre- and post-2000. The function determines the correct calculation based on the data provided.

The input is a data frame with column names matching the variables collected in the field along with SampleID, bioregion, and area (acres).

The function PHIcalc() requires no additional packages.

x <- MBSS.PHI
knitr::kable(head(x))

The function returns a dataframe of results that need to be saved (e.g., write.table()). The result table from the example data is shown below.

library(MBSStools)
# data
# MBSS.PHI = example dataset
myData <- MBSS.PHI
# calculate PHI
PHI <- PHIcalc(myData)
knitr::kable(head(PHI))

Shiny

Included in the package is an example Shiny package to calculate the three IBI's for MBSS. The the function runShiny() requires the “shinyjs” package.

library(MBSStools)
library(shinyjs)
runShiny()


leppott/MBSStools documentation built on March 5, 2025, 5:33 p.m.