knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
The catchMSY
package is intended to be used to determine MSY-based reference points.
The catchMSY package is based on the initial work for Martell and Froese (2012). "A simple method for estimating MSY from catch and resilience".
This is a simple example based on the Namibian hake dataset published in the ecoloigical detective (Hilborn and Mangle, 1997).
In this vignette I use the Namibian Hake as an example of how to use the catchMSY package using:
Install the R package from github using devtools:
devtools::install_github("smartell/catchMSY")
Then load the library.
library(catchMSY)
First, we'll have a look at the input data and the Stock ID object sID
. To create a new stock ID object we use the function new_sID
. The data for this example were taken from chapter 10 in the Ecological Detective (Hilborn and Mangel, 1997).
data(NamibianHake) names(hake) head(hake$data)
There are default values set in new_sID
for setting up a new sID
object. Make sure that you adjust the values in the sID
object that match the life history for the stock you are analyzing.
For example, you can change the age at 50% selectivity sel1
and age at 95% selectivity sel2
for the hake example by adjusting the values in the list called hake
:
# Change selectivity hake$sel1 <- 4.0 hake$sel2 <- 5.0
Given a valid `sID' object, you can quickly run the age-structured model and preview the results.
hakeOM <- catchMSYModel(hake) str(hakeOM)
The catchMSY
package automatically defines priors on natural mortality (M; par1), fishing mortality at maximum sustainable yield (Fmsy; par2), and maximum sustainable yield (MSY; par3).
hake$dfPriorInfo
The distribution and/or values defining the prior distribution can be adjusted for each parameter.
For example, you can adjust the mean and standard deviation of the lognormal prior on natural mortality to include uncertainty from the assumed mean in the hake model.
# Set parameter sampling frame hake$dfPriorInfo$dist[1] = "lnorm" hake$dfPriorInfo$par1[1] = log(hake$m) hake$dfPriorInfo$par2[1] = 0.05 * hake$m
The prior on MSY can be set as a function of the natural mortality in the hake model.
hake$dfPriorInfo$dist[2] = "unif" hake$dfPriorInfo$par1[2] = 0.20 * hake$m hake$dfPriorInfo$par2[2] = 1.50 * hake$m
We set the prior on MSY as a uniform distribution, assuming the true MSY lies somewhere between the 5th and 95th percentiles of the observed catch.
hake$dfPriorInfo$dist[3] = "unif" hake$dfPriorInfo$par1[3] = quantile(hake$data$catch,0.05) hake$dfPriorInfo$par2[3] = quantile(hake$data$catch,0.95)
There is also an option of setting a prior distribution on the age at 50% selectivity. As an example, we set this up similar to the prior on natural mortality:
## age at 50% selectivity selexPriorInfo <- data.frame("id"=4, "dist"="lnorm", "par1"=log(hake$sel1), "par2"=0.1*hake$sel1, "log"=TRUE, "stringAsFactors"=FALSE) hake$dfPriorInfo <- rbind.data.frame(hake$dfPriorInfo, selexPriorInfo) hake$dfPriorInfo
Once the life history and prior information are set up in the object sID
, use the function sample.sid
to generate random samples from the prior distributions. Specifying selex=TRUE
includes the 4th parameter, age at 50% selectivity, in the list of parameters to draw samples from the prior distribution.
# set seed for samples set.seed(123)
# Generate random samples from dfPriorInfo hake <- sample.sid(sID=hake, selex=TRUE, n=1000) colnames(hake$S) <- c("M", "Fmsy", "MSY", "sel50") head(hake$S)
par(mfrow=c(1,1)) pairs(hake$S, gap=0, pch=20, cex=0.5)
The dataframe data
in the object sID
should only include year and catch when running the catch-only model.
M0 <- hake M0$data <- M0$data[,c("year", "catch")] head(M0$data)
Because the object M0
already has the dataframe S
with the random draws from the prior distributions for each parameter, we can go ahead and run the model. This will loop through each of the sampled parameter combinations and test whether those values could have led to an extant population via the age-structured model.
M0 <- sir.sid(sID=M0, selex=TRUE, ncores=2)
The catch-only model removes the combinations of parameter values that do not lead to an extant population (i.e. that do not pass the non-statistical criteria). The black points in the scatterplot are the parameter combinations that lead to an extant population; the different color points represent parameter combinations that failed the non-statistical criteria (population crashed).
# catch-only scatterplot pairs(M0$S, gap=0, col=as.numeric(unlist(M0$code))+1, pch=20)
To integrate an abundance index into the assessment method, we specify two extra columns in the data
data frame within the sID
object - "index", representing the abundance index in each year, and "index.lse", representing some assumed value for the log standard error for the observed abundance index in each year.
M1 <- hake # year, catch, and index M1$data <- M1$data[,c("year","catch","index","index.lse")]
To include a biomass survey instead of an abundance index, the column names must be "biomass" and "biomass.lse", with NAs in years with no data.
You can then run the model the same way as in the catch-only scenario.
M1 <- sir.sid(sID=M1, selex=TRUE, ncores=2)
The same as the catch-only model, each parameter combination must pass the non-statistical criteria. For each combination that passes, the model calculates a negative log likelihood between the observed and predicted values for the abundance index. The parameter combinations are then weighted by their fit between the observed and predicted data.
head(M1$nll)
The below pairs plot uses red points to indicate combinations that would have been removed if only catch data were included in the model. The light blue points indicate parameter combinations that are discarded based on their poor fit to the abundance index. This leaves a much narrower range of possible values for each parameter (i.e. more certainty).
# catch and index scatterplot M1_cols <- rep(1, 1000) M1_cols[which(1:1000 %in% unique(M1$idx)==FALSE)] <- "steelblue" M1_cols[which(as.numeric(unlist(M0$code))>0)] <- "red" pairs(M1$S, gap=0, col=M1_cols, pch=20)
No length information is available in the Namibian hake dataset. We can use the predicted length information (mean length and length composition) from the function catchMSYModel
as pseudo-data to demonstrate the new routines to fit to a mean length time series or length composition data.
hakeOM <- catchMSYModel(hake) ML <- hakeOM$ML LC <- t(hakeOM$LF) par(mfrow=c(1,2)) plot(ML, pch=17, cex=2, xlab="Year", ylab="Mean length (cm)", type="o", ylim=c(0, max(ML, na.rm=TRUE)*1.1)) barplot(LC[nrow(LC),1:ncol(LC)], xlab="Length bin (1 cm)", ylab="Frequency", xaxt="n") axis(1, at=pretty(c(1,ncol(LC))))
To fit to the mean length time series, simply add new columns to the data
data frame in the sID
object in the same way the "index" and "index.lse" time series were added.
M2 <- hake # year, catch, and index M2$data <- cbind(M2$data[,c("year","catch")], "avgSize"=ML, "avgSize.lse"=rep(0.2, length(ML))) M2 <- sir.sid(sID=M2, selex=TRUE, ncores=2)
In the below pairs plot, the red points indicate the parameter values that are discarded due to the catch data alone, and dark blue points indicate the parameter values that are discarded based on their poor fit to the mean length data. There are not many dark blue points, indicating the mean length data was not as informative for the estimated parameters as the abundance index.
# catch and index scatterplot M2_cols <- rep(1, 1000) M2_cols[which(1:1000 %in% unique(M2$idx)==FALSE)] <- "blue" M2_cols[which(as.numeric(unlist(M0$code))>0)] <- "red" pairs(M2$S, gap=0, col=M2_cols, pch=20)
To set up the length composition data, add a column called "lencomp_sd" right after the catch data in the data
dataframe of the sID
object. Then add columns with the number of individuals captured in each length bin in each year, designated as LC_ plus the upper value of the length bin as the column name (example: LC_1, LC_2, etc.)
M3 <- hakeOM M3$data <- cbind(M3$data[,c("year","catch")], "lencomp_sd"=rep(0.2, length(M0$data$year)), LC) head(M3$data)
In general, the sampling-importance-resampling (SIR) algorithm is not efficient for fitting length composition data in the catch-MSY model. The proposed values for Fmsy and MSY produce an expected population that is too far from the truth to predict a length composition shape that is close enough to the truth. Therefore, the likelihood values are very low, resulting in none of the parameter combinations chosen as possibilities in the SIR algorithm. In this case, length composition does not provide any additional information over the catch-only model. It is not currently recommended to fit the catch-MSY method with length composition data due to these issues with the SIR algorithm.
Hilborn, R. and Mangel, M. (1997). The ecological detective: confronting models with data. Princeton Univ Pr.
Martell, S. and Froese, R. (2012). A simple method for estimating msy from catch and resilience. Fish and Fisheries.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.