The R package EchoNet2Fish estimates fish abundance from acoustic echoes and midwater trawl catch. At its core are functions that explore the data, exploreACMT() or exploreACMT2(), and generate the estimates, estimateLake(). The code is tailored to the format and procedures used by the Great Lakes Acoustic Users Group (Parker-Stetter et al. 2009).

Install

Install the EchoNet2Fish package. If you don't already have the devtools package installed, you can follow the instructions at Readme instead.

devtools::install_github("JVAdams/EchoNet2Fish")

Then load the EchoNet2Fish package.

library(EchoNet2Fish)

Prepare the data

Before any estimates are made, the acoustic and midwater trawl data must be prepared in the following way.

File organization

For each set of data for which you would like to generate estimates (e.g., one year of data from one lake), you should have a single subdirectory (subdir) containing all of the relevant files. For example, you might have subdirectories called H13, H14, and M14, containing all of the data for Lake Huron in 2013 and 2014 and Lake Michigan in 2014. Within each subdirectory, you should have two more subdirectories for the acoustic data (one for all the Sv files and one for all the TS files), along with all of your midwater trawl files (operations, catches, lengths, and age-length keys). Above these subdirectories should be an overarching directory (refdir) that contains a reference csv file (refcsv).
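As a rough illustration of this layout, the sketch below builds an empty directory tree in a temporary location using base R only. The acoustic subdirectory names SV and TS are borrowed from the reference file example in this vignette; the demo directory name is hypothetical, and no data files are created.

```r
# Hypothetical demo location; in practice refdir is your own overarching directory
refdir <- file.path(tempdir(), "EchoNet2Fish_demo")
subdirs <- c("H13", "H14", "M14")

# One subdirectory per lake-year, each with SV and TS subdirectories inside
for (s in subdirs) {
  dir.create(file.path(refdir, s, "SV"), recursive = TRUE, showWarnings = FALSE)
  dir.create(file.path(refdir, s, "TS"), recursive = TRUE, showWarnings = FALSE)
}

# Show the resulting tree (paths relative to refdir)
tree <- list.dirs(refdir, full.names = FALSE, recursive = TRUE)
tree
```

The midwater trawl csv files would sit directly in H13, H14, and M14, and the reference csv file directly in refdir.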

Reference file

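The reference csv file contains information, primarily the directory and file names for the acoustic and midwater trawl data, for all of the subdirectories (one row for each subdirectory). The file must contain these 10 columns, shown in the example below: subdir, svsubdir, tssubdir, optropf, trcatchf, trlff, keysp1, keyfile1, keysp2, and keyfile2.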

There should also be one or more additional columns for keyvars, the key variable(s) used to define each unique run of the exploration and estimation process. In the example below, LAKE and YEAR are used as the key variables.

refdir <- "C:/Temp"
refdat <- data.frame(
  LAKE = c(3, 3, 2), 
  YEAR = c(2013, 2014, 2014), 
  subdir = c("H13", "H14", "M14"), 
  svsubdir = c("SV", "SV", "SV"), 
  tssubdir = c("TS", "TS", "TS"), 
  optropf = c("H.mtr.op13", "H.mtr.op14", "MtrawlOp"), 
  trcatchf = c("H.catch13", "H.catch14", "M.catch"), 
  trlff = c("Htr_lf13", "H.tr_lf14", "M.tr_lf"), 
  keysp1 = c(NA, NA, 106), 
  keyfile1 = c(NA, NA, "aleagelenkey"), 
  keysp2 = c(NA, NA, NA), 
  keyfile2 = c(NA, NA, NA))
write.csv(refdat, paste(refdir, "Reference.csv", sep="/"))
knitr::kable(refdat)
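Before reading any data, it can help to confirm that a reference table has the columns it needs and that the key variables identify each row uniquely. The following is a minimal base R sketch, not part of the package; the list of required column names is an assumption taken from the example above.

```r
# A one-row reference table modeled on the example above
refdat <- data.frame(
  LAKE = 2, YEAR = 2014, subdir = "M14", svsubdir = "SV", tssubdir = "TS",
  optropf = "MtrawlOp", trcatchf = "M.catch", trlff = "M.tr_lf",
  keysp1 = 106, keyfile1 = "aleagelenkey", keysp2 = NA, keyfile2 = NA)

# Assumed required columns (taken from the example), plus the key variables
required <- c("subdir", "svsubdir", "tssubdir", "optropf", "trcatchf",
  "trlff", "keysp1", "keyfile1", "keysp2", "keyfile2")
keyvars <- c("LAKE", "YEAR")

# Report any columns that are absent
missing_cols <- setdiff(c(required, keyvars), names(refdat))
if (length(missing_cols) > 0) {
  stop("Reference table is missing: ", paste(missing_cols, collapse = ", "))
}

# Each combination of key variables should point to exactly one row
has_dup_keys <- any(duplicated(refdat[, keyvars]))
has_dup_keys
```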

Acoustic files

Except where noted, the order of the rows and columns in any of the acoustic or midwater trawl files is not important. Additional columns, i.e., other than those that are required, may be included in the files, but are not necessary.

In both the SV and TS csv files, the first column is automatically assigned the name Dummy_ID. This name write-over is needed to handle occasional problems with byte order marks at the beginning of the csv files. The variable Dummy_ID is not actually used in the exploration or estimation procedures.
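To see why the name write-over helps, the sketch below writes a small csv file that starts with a UTF-8 byte order mark, reads it with base R, and then overwrites the first column name. It illustrates the idea only and is not the package's own code; the column name Process_ID is hypothetical.

```r
# Write a tiny csv file that begins with a UTF-8 byte order mark (EF BB BF)
f <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("Process_ID,Interval,Layer\n7,1,1\n7,1,2\n")), f)

# Depending on platform and locale, the first name may come back mangled
dat <- read.csv(f)
names(dat)[1]

# Overwriting the first name sidesteps the problem entirely
names(dat)[1] <- "Dummy_ID"
names(dat)
```

Because the first column is never used in the analysis, replacing its name is safe regardless of whether a byte order mark was present.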

The following 8 columns must be included in both the SV and TS csv files:

These columns serve to uniquely identify each row in the files, and are used to combine the information from both types of files.

The following 4 columns must be included in the SV csv files:

mydir <- readAll(refdir="C:/JVA/Consult/Warner/Nearest Trawl", keyvals=c(2, 2014), 
  keyvars=c("LAKE", "YEAR"), rdat="ACMT", refcsv="Reference")
load(paste(mydir, "ACMT.RData", sep="/"))
sv.vars <- c("Dummy_ID", "Region_name", "Interval", "Layer", "Date_M",
  "Sv_mean", "Depth_mean", "Layer_depth_min", "Layer_depth_max", "Lat_M",
  "Lon_M", "PRC_ABC")
set.seed(395)
look <- sv[sample(dim(sv)[1], 5), sv.vars]
look$Dummy_ID <- rep(1, 5)
look$Date_M <- format(look$Date_M, "%Y%m%d")
look$Sv_mean <- round(look$Sv_mean, 1)
look$Depth_mean <- round(look$Depth_mean, 1)
look$Lat_M <- round(look$Lat_M, 2)
look$Lon_M <- round(look$Lon_M, 2)
look$PRC_ABC <- format(look$PRC_ABC, digits=3)
knitr::kable(look, row.names=FALSE)

Several columns must be included in the TS csv files indicating the number of targets in each target strength bin. Each target count column is named according to the integer target strength, with an X. prefix, e.g.:

ts.vars <- c("Dummy_ID", "Region_name", "Interval", "Layer", "Layer_depth_min",
  "Layer_depth_max", "Lat_M", "Lon_M", "X.36", "X.37", "X.38", "X.64", "X.65",
  "X.66")
look <- ts[sample(dim(ts)[1], 5), ts.vars]
look$Dummy_ID <- rep(1, 5)
knitr::kable(look, row.names=FALSE)
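The X. prefix is what base R produces when it reads a csv column whose header is a negative number. The sketch below assumes the raw TS file headers are negative integer target strengths such as -36 (an assumption, suggested by the TSrange values used later), and shows how the bin value in dB can be recovered from the column names.

```r
# A tiny TS-like csv file with negative integer headers (assumed raw format)
f <- tempfile(fileext = ".csv")
writeLines(c("Region_name,Interval,Layer,-36,-37,-38", "nn,1,1,0,2,5"), f)

# read.csv() makes the headers syntactically valid: "-36" becomes "X.36"
ts_demo <- read.csv(f)
names(ts_demo)

# Recover the target strength (dB) represented by each bin column
bin_cols <- grep("^X\\.[0-9]+$", names(ts_demo), value = TRUE)
ts_db <- -as.numeric(sub("^X\\.", "", bin_cols))
ts_db
```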

Midwater trawl files

The following column must be included in the midwater trawl operation, catch, and length csv files: Op.Id.

This column serves to uniquely identify each trawl haul in the files.

The following 9 columns must be included in the midwater trawl operation csv files:

op.vars <- c("Op.Id", "Year", "Lake", "Beg.Depth", "End.Depth", 
  "Fishing_Depth", "Transect", "Latitude", "Longitude")
look <- optrop[sample(dim(optrop)[1], 5), op.vars]
look$Latitude <- round(look$Latitude, 2)
look$Longitude <- round(look$Longitude, 2)
knitr::kable(look, row.names=FALSE)

The following 3 columns must be included in the midwater trawl catch csv files:

ct.vars <- c("Op.Id", "Species", "Weight", "N")
look <- trcatch[sample(dim(trcatch)[1], 5), ct.vars]
knitr::kable(look, row.names=FALSE)

The following 3 columns must be included in the midwater trawl length csv files:

lf.vars <- c("Op.Id", "Species", "Length", "N")
look <- trlf[sample(dim(trlf)[1], 5), lf.vars]
knitr::kable(look, row.names=FALSE)

Note that the number of fish measured (N in the length csv file) need not be the same as the total number captured (N in the catch csv file). All proportions based on size are calculated by first scaling up the measured fish to the total catch, to account for those instances when only a subset of the catch is measured.
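A small worked example may make the scaling step concrete. The sketch below uses made-up numbers for two hauls and base R only; it shows one plausible way to scale measured fish up to the total catch and is not the package's internal code. The names N_catch, N_measured, and N_scaled are introduced here for illustration.

```r
# Made-up catch totals and length measurements for one species in two hauls
catch <- data.frame(Op.Id = c(1, 2), Species = 106, N = c(200, 30))
lf <- data.frame(Op.Id = c(1, 1, 2, 2), Species = 106,
  Length = c(60, 120, 60, 120), N = c(10, 40, 20, 10))

# Total caught and total measured per haul and species
names(catch)[names(catch) == "N"] <- "N_catch"
measured <- aggregate(N ~ Op.Id + Species, data = lf, FUN = sum)
names(measured)[names(measured) == "N"] <- "N_measured"

# Scale each measured count up to the haul's total catch
lf2 <- merge(merge(lf, catch), measured)
lf2$N_scaled <- lf2$N * lf2$N_catch / lf2$N_measured

# Proportion of fish at or below 100 mm, with and without scaling
prop_small_scaled <- sum(lf2$N_scaled[lf2$Length <= 100]) / sum(lf2$N_scaled)
prop_small_raw <- sum(lf$N[lf$Length <= 100]) / sum(lf$N)
c(scaled = prop_small_scaled, raw = prop_small_raw)
```

In this example only a quarter of the first haul was measured, so the unscaled proportion gives too much weight to the second haul.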

Age-length key files

No age-length key is required, but you may include age-length keys in separate csv files for up to 2 different species. Each file should be arranged such that rows represent 10-mm length categories and columns represent 1-yr ages. The file must contain one variable called mmgroup giving the midpoint of the length category, e.g., mmgroup 15 represents fish $\geq$ 10 mm and < 20 mm. For each length category, the proportion of fish in each age is represented by a series of columns which sum to 1 (or 0, if no fish used to derive the key were found in that length category). Each proportional column is named according to the integer age, with an Age prefix, e.g.:

look <- key1[9:13, 2:9]
look <- apply(look, 2, round, 3)
knitr::kable(look, row.names=FALSE)
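The sketch below builds a tiny, made-up age-length key in the arrangement described above, checks that each row of proportions sums to 1 or 0, and applies the key to a few fish lengths. The exact column names (Age0, Age1, Age2) are an assumption about how the Age prefix is combined with the integer age, and the calculation is an illustration rather than the package's method.

```r
# Made-up key: rows are 10-mm length categories, columns are ages
key <- data.frame(
  mmgroup = c(55, 65, 75),
  Age0 = c(1, 0.6, 0),
  Age1 = c(0, 0.4, 0.7),
  Age2 = c(0, 0, 0.3))
age_cols <- grep("^Age", names(key), value = TRUE)

# Each row of proportions should sum to 1 (or 0 if no fish informed that row)
row_sums <- rowSums(key[, age_cols])
stopifnot(all(abs(row_sums - 1) < 1e-8 | row_sums == 0))

# Assign fish to length categories by midpoint, e.g., 52 mm falls in mmgroup 55
lengths_mm <- c(52, 68, 71, 79)
mmgroup <- 10 * floor(lengths_mm / 10) + 5
n_at_length <- table(factor(mmgroup, levels = key$mmgroup))

# Expected number of fish at each age
n_at_age <- colSums(key[, age_cols] * as.vector(n_at_length))
n_at_age
```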

Read in the data

With the data organized as described and the reference file as a guide to the directory and file names, you can now read in the acoustic and midwater trawl data using the readAll() function.

In the example below, the overarching directory refdir is C:/Temp (use forward slashes for paths), and the reference file refcsv is Reference.csv. Only data from the 3rd row of the reference file will be read, specified by keyvals for the keyvars, i.e., LAKE==2 and YEAR==2014. The data are saved to an RData file rdat named ACMT, and the subdirectory path is returned.

mydir <- readAll(refdir="C:/Temp", keyvals=c(2, 2014), 
  keyvars=c("LAKE", "YEAR"), rdat="ACMT", refcsv="Reference")
mydir <- "C:/Temp/M14/"
mydir
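The way keyvals and keyvars pick out a single row of the reference file can be sketched with base R as follows. This is an illustration of the selection logic only, using a trimmed-down copy of the reference table, and is not how readAll() is necessarily implemented.

```r
# Trimmed-down copy of the reference table from earlier in this vignette
refdir <- "C:/Temp"
refdat <- data.frame(
  LAKE = c(3, 3, 2),
  YEAR = c(2013, 2014, 2014),
  subdir = c("H13", "H14", "M14"))

keyvars <- c("LAKE", "YEAR")
keyvals <- c(2, 2014)

# TRUE for rows where every key variable equals its key value
match_row <- Reduce(`&`, Map(function(v, val) refdat[[v]] == val, keyvars, keyvals))

# Exactly one row should match; otherwise the keys are ambiguous or absent
stopifnot(sum(match_row) == 1)
selected_dir <- file.path(refdir, refdat$subdir[match_row])
selected_dir
```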

Explore the data

Now you're ready to explore the data with the exploreACMT() or exploreACMT2() function. The exploreACMT() function was designed specifically for use with data from the US Geological Survey - Great Lakes Science Center. The exploreACMT2() function is a simplified version that does not expect the additional columns the USGS data contains; use this one if in doubt.

The AC and MT arguments are used to indicate if you want to explore the AC and MT data. The ageSp argument gives the species codes for species that you want to apply age-length keys to; in this example ageSp=106 for alewife. The short argument is used to indicate if the surveyed area is wider (in the east-west direction) than tall (in the north-south direction); in this example short=FALSE because Lake Michigan is not wider than tall.

exploreACMT(maindir=mydir, rdat="ACMT", AC=TRUE, MT=TRUE, ageSp=106, short=FALSE)

When you run this function, a rich text file (rtf) with a *.doc file extension (so that it will be opened with Word by default) is saved to maindir. It includes a long series of tables and figures summarizing the variables in the acoustic and midwater trawl data files. These are designed to help the investigator look for potential problems in the data.

Generate estimates

Once you've error checked the data and are confident that they are in good shape, you are ready to generate lake-wide estimates.

Regions

Define the strata used to design the survey. These are assumed to be non-overlapping two-dimensional regions that correspond to Region_name in the SV and TS files and Transect in the midwater trawl operation file. Also supply the surface areas (in ha) that correspond to each stratum. If you did not use strata in the design of your survey, just use a single region.

Mreg <- c("nn", "sn", "wn", "no", "so")
MArea <- c(10933, 8716, 6010, 12630, 10487)
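Since the region names and areas are supplied as two separate vectors, a quick consistency check can catch mismatches early. The sketch below uses base R only; the data frame and its column names are introduced here for illustration.

```r
Mreg <- c("nn", "sn", "wn", "no", "so")
MArea <- c(10933, 8716, 6010, 12630, 10487)

# One positive area per uniquely named region
stopifnot(length(Mreg) == length(MArea), !anyDuplicated(Mreg), all(MArea > 0))

# Tabulate the strata with their share of the total surveyed area
strata <- data.frame(region = Mreg, area_ha = MArea)
strata$area_km2 <- strata$area_ha / 100            # 1 km2 = 100 ha
strata$area_share <- strata$area_ha / sum(strata$area_ha)
strata
```

It is also worth confirming that every Region_name in the acoustic files and every Transect in the trawl operation file appears in Mreg.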

Size information

Create a data frame of species-specific size information for each species group for which you wish to generate abundance estimates. The data frame has five variables: sp (species code), spname (species name), lcut (length cut off, in mm), and lwa and lwb (parameters of the length-weight relation, $Wg = lwa \times Lmm^{lwb}$, where Wg is the weight in g and Lmm is the total length in mm). The length cut off is used to divide the data for a given species into two groups for abundance estimation: fish with lengths $\leq$ lcut and fish with lengths > lcut. If you don't wish to divide a species into two length groups, set lcut to 0 for that species. This data frame will be used as input to the estimateLake() function.

myspInfo <- data.frame(
  sp = c(106, 109, 129),
  spname = c("alewife", "rainbow smelt", "threespine stickleback"),
  lcut = c(100, 90, 0),
  lwa = c(1.41e-05, 4.85e-06, 3.95e-05),
  lwb = c(2.87, 3.03, 2.59)
)
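To illustrate how these parameters are meant to be read, the sketch below applies the length-weight relation and the length cut off to a few made-up fish using base R. The fish data frame and the group labels (small, large, all) are hypothetical and are not names used by the package.

```r
myspInfo <- data.frame(
  sp = c(106, 109, 129),
  spname = c("alewife", "rainbow smelt", "threespine stickleback"),
  lcut = c(100, 90, 0),
  lwa = c(1.41e-05, 4.85e-06, 3.95e-05),
  lwb = c(2.87, 3.03, 2.59)
)

# Made-up fish: species code and total length (mm)
fish <- data.frame(sp = c(106, 106, 109, 129), Lmm = c(80, 150, 90, 50))
fish <- merge(fish, myspInfo, by = "sp")

# Weight (g) from the length-weight relation Wg = lwa * Lmm^lwb
fish$Wg <- fish$lwa * fish$Lmm^fish$lwb

# Length group: <= lcut is small, > lcut is large, lcut of 0 means no split
fish$group <- ifelse(fish$lcut == 0, "all",
  ifelse(fish$Lmm <= fish$lcut, "small", "large"))
fish[, c("spname", "Lmm", "Wg", "group")]
```

Note that a rainbow smelt of exactly 90 mm falls in the smaller group, because the cut off is inclusive on the lower side.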

Slices

The fish density in each acoustic cell (interval-layer) is apportioned to species using the composition from the nearest midwater trawl within a given slice. Slices are not necessarily the same as strata, which are used in the design of the survey and the estimation of the total population.

The slices can be defined by combinations of fishing depths (fdp), bottom depths (bdp), longitudes (lon), latitudes (lat), and regions (reg). The slice definition is described by a list of slices; each slice is a list of defining characteristics; and each characteristic is a named vector giving the range of values. This will be used as input to the sliceCat() function.

This is easier to explain with some examples. Let's say you want to define two slices, based on a dividing line at a fishing depth of 20 m. You would write your slice definition as

myslice1 <- list(
  epi  = list(fdp=c(-Inf,  20)),
  hypo = list(fdp=c(  20, Inf))
  )

This names the upper slice "epi" and the lower slice "hypo" (you can name the slices whatever you want, but they must be named). And it defines these slices by the corresponding range of fishing depths (fdp), from $\geq$ negative infinity to < 20 for epi and from $\geq$ 20 to < positive infinity for hypo.
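The boundary behavior described here (lower limit inclusive, upper limit exclusive) can be sketched in base R as follows. The helper in_range() is hypothetical and is shown only to make the interval convention explicit; it is not the implementation of sliceCat().

```r
# Hypothetical helper: TRUE where rng[1] <= x < rng[2]
in_range <- function(x, rng) x >= rng[1] & x < rng[2]

# Fishing depths (m), including one exactly on the 20 m dividing line
fdp <- c(5, 19.9, 20, 35)

slice <- ifelse(in_range(fdp, c(-Inf, 20)), "epi",
  ifelse(in_range(fdp, c(20, Inf)), "hypo", NA))
data.frame(fdp, slice)
```

A fishing depth of exactly 20 m is assigned to hypo, so every depth belongs to one and only one slice.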

Or, perhaps you want to define four slices, using the same premise as before, but now dividing the epilimnion into three parts according to latitude. You might write your slice definition as

myslice2 <- list(
  epi.south   = list(fdp=c(-Inf,  20), lat=c(-Inf,  43)),
  epi.central = list(fdp=c(-Inf,  20), lat=c(  43,  45)),
  epi.north   = list(fdp=c(-Inf,  20), lat=c(  45, Inf)),
  hypo        = list(fdp=c(  20, Inf))
  )

Similarly, slices may also be defined by ranges of bottom depths and longitudes. If you wish to define slices by regions that are not simply described by longitudinal or latitudinal breakpoints, for example separating the bays from the main basin, you might write your slice definition as

myslice3 <- list(
  epi.bays = list(fdp=c(-Inf,  20), reg=c("Bay A", "Bay B")),
  epi.main = list(fdp=c(-Inf,  20), reg="Main"),
  hypo     = list(fdp=c(  20, Inf))
  )

I'll provide some fake data to demonstrate how these slice definitions compare.

fishingD <- c(13, 10, 17, 15, 18, 22, 21, 25, 24, 26)
latitude <- c(42, 44, 44, 46, 47, 46, 46.1, 66, 43.2, 41)
region <- c("Bay A", "Bay B", "Main")[c(3, 1, 3, 2, 3, 1, 2, 3, 3, 3)]
s1 <- sliceCat(myslice1, fdp=fishingD)
s2 <- sliceCat(myslice2, fdp=fishingD, lat=latitude)
s3 <- sliceCat(myslice3, fdp=fishingD, reg=region)
data.frame(fishingD, latitude, region, s1, s2, s3)

Estimates

Now you're ready to use the estimateLake() function to generate lake-wide estimates in both number (millions) and biomass (t). The region and regArea arguments are used to indicate the regional strata and their corresponding areas (in ha). The TSrange is the target strength range of interest. The TSthresh is the minimum threshold for the number of binned targets in a cell (layer by interval) for calculating target strength. The psi is the transducer-specific two-way equivalent beam angle in steradians. Species of interest are specified by soi, and each of these species should have size information in the data frame specified by spInfo. The slice definitions by which the estimates should be summarized are specified by the sliceDef argument. Finally, the descr argument is used to add a little descriptive text to the names of the output files.

estimateLake(maindir=mydir, rdat="ACMT", ageSp=106, region=Mreg, regArea=MArea,
  TSrange=c(-60, -30), TSthresh=1, psi=0.007997566, soi=c(106, 109, 129),
  spInfo=myspInfo, sliceDef=myslice1, short=FALSE, descr="vignette")
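The sketch below illustrates, with made-up counts for a single cell, how TSrange and TSthresh might be interpreted. It assumes that both ends of TSrange are inclusive and that a cell qualifies when its count of in-range targets is at least TSthresh; both are assumptions, not confirmed behavior of estimateLake(). It also converts psi to decibels using the standard relation 10 * log10(psi), which can be handy for checking the value against transducer calibration sheets.

```r
TSrange <- c(-60, -30)
TSthresh <- 1

# Made-up target counts for one cell, named like the TS bin columns
cell <- c(X.28 = 3, X.36 = 0, X.45 = 2, X.60 = 1, X.64 = 7)
ts_db <- -as.numeric(sub("^X\\.", "", names(cell)))

# Keep bins inside the range of interest (inclusive ends are an assumption)
in_range <- ts_db >= TSrange[1] & ts_db <= TSrange[2]
n_targets <- sum(cell[in_range])

# Does the cell have enough targets to calculate target strength?
enough <- n_targets >= TSthresh

# Equivalent beam angle: steradians to dB re 1 steradian
psi <- 0.007997566
psi_db <- 10 * log10(psi)
c(n_targets = n_targets, enough = enough, psi_db = round(psi_db, 2))
```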

When you run this function, a rich text file (rtf) with a *.doc file extension (so that it will be opened with Word by default) is saved to maindir. The document includes figures showing the location and apportionment of midwater trawl hauls in the slices and spatial maps of density for each species group, as well as tables of the lake-wide estimates in both number (millions) and biomass (t) for each species group and slice.

In addition, six different data frames of estimates are saved as objects in an RData file and are written to csv files:

The rtf, RData, and csv files are all named using the lake, the year, and the descr text.

Summary

Below is a summary of the code used in this vignette, simplified somewhat by relying on default values wherever possible.

library(EchoNet2Fish)

# read in the data
mydir <- readAll(refdir="C:/Temp", keyvals=c(2, 2014))

# explore the data
exploreACMT(maindir=mydir, ageSp=106, short=FALSE)

# define survey strata
Mreg <- c("nn", "sn", "wn", "no", "so")
MArea <- c(10933, 8716, 6010, 12630, 10487)

# define species size info
myspInfo <- data.frame(
  sp = c(106, 109, 129),
  spname = c("alewife", "rainbow smelt", "threespine stickleback"),
  lcut = c(100, 90, 0),
  lwa = c(1.41e-05, 4.85e-06, 3.95e-05),
  lwb = c(2.87, 3.03, 2.59)
)

# define summary slices
myslice1 <- list(
  epi  = list(fdp=c(-Inf,  20)),
  hypo = list(fdp=c(  20, Inf))
  )

# generate estimates
estimateLake(maindir=mydir, ageSp=106, region=Mreg, regArea=MArea,
  soi=c(106, 109, 129), spInfo=myspInfo, sliceDef=myslice1, short=FALSE, descr="vignette")

References

Parker-Stetter, S. L., Rudstam, L. G., Sullivan, P. J., and Warner, D. M. 2009. Standard operating procedures for fisheries acoustic surveys in the Great Lakes. Great Lakes Fish. Comm. Spec. Pub. 09-01. www.glfc.org/pubs/SpecialPubs/Sp09_1.pdf.



JVAdams/EchoNet2Fish documentation built on Feb. 15, 2021, 4:27 a.m.