knitr::opts_chunk$set(fig.align="center", external=TRUE, echo=TRUE, warning=FALSE, fig.pos="H")
This package contains functions to run the Length-based Integrated Mixed Effects (LIME) fisheries stock assessment method. The LIME package can be used in two ways:
LIME was developed for data-limited fisheries, where few data are available other than a representative sample of the size structure of the vulnerable portion of the population (i.e., length composition data from the catch) and an understanding of the life history of the species.
While length composition from the catch can hold information on fishing mortality, gear selectivity, and recruitment, with process and observation error it can be very difficult to tease apart the patterns and biological mechanisms from the noise. With additional information, such as even a few years of catch trends or an abundance index, analysts can include additional information to the LIME model to help tease apart these processes. For example, catch data is much more informative of the fishing mortality rate over time than the length composition data. Catch used alongside length composition data can help decrease bias and uncertainty in estimates of fishing mortality rates.
LIME also includes all years of data in the same analysis with a penalty on the annual fishing mortality rate to keep it from increasing or decreasing unrealistically between years. However, LIME assumes the selectivity curve is constant over time for each fleet.
LIME relaxes the equilibrium assumptions of other length-based methods by estimating annual fishing mortality and recruitment variation (among other parameters), deriving annual recruitment as a random effect.
See Rudd and Thorson (2017) in the reference list for full details of the model, including simulation testing to evaluate performance across life history types, population variability scenarios, and data availability scenarios, as well as violations of the model assumptions.
Please alert me to any bugs or issues by using GitHub.
Comments and suggestions for additional features are welcome and we can discuss via email at mbrudd@uw.edu.
devtools
: You can install the development version of the package from GitHub using the devtools
package:install.packages("devtools", repos='http://cran.us.r-project.org')
Follow the steps to install TMB:
Windows: https://github.com/kaskr/adcomp/wiki/Windows-installation
Linux: https://github.com/kaskr/adcomp (See README)
Mac: should be the same as Linux, please see contact info under Bug Reports if that doesn't work.
Install required package TMBhelper
:
devtools::install_github("kaskr/TMB_contrib_R/TMBhelper")
devtools::install_github("merrillrudd/LIME", dependencies=TRUE)
The installation may produce error messages requiring other packages. You can download those from CRAN if they are not automatically downloaded in the LIME installation process, and then start from the step that produced the error. Again, please use the contact info in Bug Reports if this doesn't work, as you may have discovered a new installation error... congratulations!
library(LIME) library(tidyverse)
The LIME package simulation and estimation features require the biological inputs (e.g. growth, natural mortality, maturity) and starting values for selectivity parameters in a list. Using create_lh_list
will make sure all required elements are included within the list and in the required format. The create_lh_list
function requires inputs for some parameters and includes default values for others.
create_lh_list
The user is required to input the following parameters to create_lh_list
for simulation and estimation:
Minimum inputs: Biology
linf
: von Bertalanffy asymptotic lengthvbk
: von Bertalanffy growth coefficientM
: Annual natural mortality ratelwa
: Length-weight scaling parameterlwb
: Length-weight allometric parameterM50
Length or age at 50% maturitymaturity_input
: Whether M50
input is in "length" or "age"Minimum inputs: Exploitation
S50
: Length or age at 50% selectivityS95
: Length or age at 95% selectivityselex_input
: Whether S50
and S95
are in "length" or "age"create_lh_list
with default settingsOther inputs: Biology
M95
: Length or age at 95% maturity; default=NULL for one-parameter logistic maturity. M95
will use two-parameter logistic maturity curve. It is assumed M95
should be input as length if M50
is length, same for age.t0
: Length at age=0; Default=-0.01, should be adjusted based on local studies or meta-analysis.R0
: Equilibrium recruitment; Default=1.0. Changing R0
will change the scale of the population size. It is recommended to set an initial value of R0
more appropriate to the expected population size if using catch data to estimate population size with LIME. h
: Steepness; Default=1.0 such that the expected recruitment calculated in the Beverton-Holt stock-recruit curve is not affected by the level of spawning biomass. Setting h
below 1.0 will lead to the expected annual recruitment as a function of the spawning biomass. LIME will still calculate recruitment deviates around this expected value of annual recruitment. AgeMax
: Maximum age; Default=-log(0.01)/M, the age at which 1% of the population is still alive based on the specified natural mortality rate M
. This age can be adjusted based on local evidence of maximum age.rho
: Recruitment autocorrelation, used in simulation only; Default=0 (no recruitment autocorrelation). Changing the value of rho
will add autocorrelation to the generated recruitment time series for a simulation study. Other inputs: Exploitation
S95
: Length or age at 95% selectivity; Default=NULL
for one-parameter logistic selectivity; specifying S95
will use two-parameter logistic selectivity curve. It is assumed S95
should be input as length if S50
is length, same for age.selex_type
: Selectivity function type; Default=logistic
using a one-parameter logistic selectivity curve if S95
is not specified, or a two-parameter logistic curve if S95
is specified. Alternate option=dome
to generate a population exploited based on a dome-shaped selectivity curve. Dome-shaped selectivity cannot currently be estimated in LIME, but specifying dome-shaped selectivity in the create_lh_list
function can generate an assumed dome-shaped curve to be fixed when using LIME for estimation. dome_sd
: Dome-shaped selectivity right-hand standard deviation; Default=NULL
for logistic selectivity, must be specified if selex_type="dome"
. The standard deviation of the normal distribution for the right side of the selectivity curve (after the logistic curve specified reaches 100% selectivity).Fequil
: The proportion of spawning biomass relative to unfished spawning biomass at bioeconomic equilibrium, default=0.5.Frate
: exponent determining the strength of coupling between effort and changes in biomass, default=0.2.qcoef
: Catchability coefficient; default=1e-5. Estimated when using an abundance index, starting value should be adjusted to a reasonable number based on the abundance index and starting value for the scaling parameter R0
.theta
: Dirichlet-multinomial parameter; Default=1, a relative small value. With a larger theta
(e.g. 10) there will be no variance inflation where the effective sample size will approach the input sample size (Thorson et al. 2017)nseasons
: Number of seasons in a year; Default=1 for annual length composition data and annual growth and mortality parameters. For short-lived species, it is helpful to use a shorter-than-annual time step to account for the growth of short-lived fish during one year, if length data is collected on time steps less than one year (e.g. month, nseasons
=12).nfleets
: Number of fleets to be modeled in the fishery; default=1 for a single fleet. With nfleets
greater than 1, the number of S50
, S95
, and selex_type
values must match the value for nfleets
. Example, S50
=c(20,26), S95
=c(25,35), selex_type
=rep('logistic', nfleets
)Other inputs: Variation
CVlen
: Coefficient of variation around the growth curve; Default=0.1; Should be adjusted based on the expected variability in the age-length curve.SigmaR
: Recruitment standard deviation; Default=0.737, the median across all fish species (Thorson et al. 2014); Used for generating recruitment deviates in the simulation or as a starting value for its estimation using LIME. SigmaF
: Fishing mortality standard deviation; Default=0.2; Used for generating fishing mortality deviates in the simulation or as the standard deviation of the fishing mortality penalty when estimating annual fishing mortality using LIME. SigmaC
: Catch standard deviation; Default=0.2; Used as the fixed standard deviation in the lognormal likelihood when fitting LIME to catch data.SigmaI
: Index standard deviation; Default=0.2; Used as the fixed standard deviation in the lognormal likelihood when fitting LIME to index data.Now let's populate the create_lh_list
function with example values.
lh <- create_lh_list(vbk=0.21, linf=65, t0=-0.01, lwa=0.0245, lwb=2.79, S50=c(20), S95=c(26), selex_input="length", selex_type=c("logistic"), M50=34, maturity_input="length", M=0.27, binwidth=1, CVlen=0.1, SigmaR=0.737, SigmaF=0.2, SigmaC=0.1, SigmaI=0.1, R0=1, Frate=0.1, Fequil=0.25, qcoef=1e-5, start_ages=0, rho=0.43, theta=10, nseasons=1, nfleets=1)
Now we should check out the biological parameters and selectivity we've created. Note that even though we input maturity and selectivity by length, create_lh_list
converts to age and outputs both selectivity-at-age by fleet (lh$S_fa
) and selectivity-at-length by fleet (lh$S_fl
).
par(mfrow=c(2,2), mar=c(4,4,3,1)) plot(lh$L_a, type="l", lwd=4, xlab="Age", ylab="Length (cm)") plot(lh$W_a, type="l", lwd=4, xlab="Age", ylab="Weight (g)") plot(lh$Mat_l, type="l", lwd=4, xlab="Length (cm)", ylab="Proportion mature") # plot selectivity for the first (and only) fleet (first row) plot(lh$S_fl[1,], type="l", lwd=4, xlab="Length (cm)", ylab="Proportion selected to gear")
Particularly for short-lived fish, it is possible that an annual time step is too coarse of a time scale to capture individual growth between years. For example, if a fish grows rapidly between ages 1 and 2, it is possible the probability of being a length given age will result in a negligible probability of being certain lengths. However, it is likely those lengths will be observed, and in this case LIME will not be able to fit the data well.
To check for this issue, we recommend plotting the probability of being in a length bin given age to make sure there is greater than negligible probability of observing all lengths. In this case, there is a small overlap between the distributions for age 1 and 2 fish, which should be enough to have an adequate model fit if fish are observed at approximately 17 cm.
For tips on what to do when there is a negligible probability of observing some lengths, see the "Format data" section in this document.
plba <- with(lh, age_length(highs, lows, L_a, CVlen))
ramp <- colorRamp(c("purple4", "darkorange")) col_vec <- rgb( ramp(seq(0, 1, length = nrow(plba))), max = 255) matplot(t(plba[-1,]), type="l", lty=1, lwd=3, col=col_vec, xaxs="i", yaxs="i", ylim=c(0, 0.5), xlab="Length bin (cm)", ylab="Density") legend("topright", legend=lh$ages[seq(2,length(lh$ages),by=3)], col=col_vec[seq(2,length(lh$ages),by=3)],, lwd=3, title="Age")
The first subsection on data formatting provides an overview of the simulation functions within the LIME package. To learn how to format data from your fishery, skip to the "Format data" section.
Using the life history list output from create_lh_list
, we can simulate a population and generate data.
The generate_data
function has several settings for which the user is required to input a value:
modpath
: model path for saving simulated populations; can set as NULL
to run within the R environment only without saving locally.itervec
: vector of iterations of simulated data; can be 1 to run only one iteration, or 1:100 to run 100 iterations of simulated populations.lh
: life history list, output from create_lh_list
.Fdynamics
: Pattern for fishing mortality dynamics; options include Constant
, Oneway
, Endogenous
, or None
.Rdynamics
: Pattern for recruitment dynamics; options include Constant
, or Pulsed
.Nyears
: number of years for your simulated population.Nyears_comp
: number of years to generate length data; e.g. if 10 years and Nyears
is 20 years, will be the last 10 years in the 20 year time series. Must correspond to the number of fleets, e.g. c(20,10) would mean 20 years of length data for fleet 1 and 10 years of length data for fleet 2.comp_sample
: nominal sample size of length data for each year; e.g. vector of length 20 (1 for each year), with 200 length measurements collected annually. If there is more than 1 fleet and only 1 number is specified, that number will apply to all fleets (e.g. 200 samples in each fleet).init_depl
: initial depletion, or the proportion of the unfished population biomass in the first year of the population to be modeled. Specifying a single value indicates the initial depletion (e.g. 0.5) or two values indicates the lower and upper bounds of a uniform distribution for which the population simulation will randomly draw a depletion value (e.g. c(0.1,0.9)).seed
: set a seed for random numbers for generating process deviations.There are also several other settings not required but may be useful:
rewrite
: default=TRUE to always re-simulated data. rewrite
=FALSE may be useful to only simulate a new population if the True.rds
output file already exists in the folder.pool
: default=TRUE, meaning that if the number of seasons per year as specified in the create_lh_list
function is more than 1, the length composition data collected in each time set is pooled together annually. pool
=FALSE would mean the generated length composition data is on a time step shorter than 1 year (e.g. monthly if nseasons
=12). The total sample size for the year will still be equal to comp_sample
.Now let's use generate_data
to simulate 1 example population (itervec=1) to generate length, catch, and an abundance index. We won't specify a modpath
so the simulated population will just be used locally in this vignette.
true <- generate_data(modpath=NULL, itervec=1, Fdynamics=c("Endogenous"), Rdynamics="Constant", lh=lh, Nyears=20, Nyears_comp=c(20), comp_sample=rep(200,20), init_depl=0.7, seed=123, fleet_proportions=1)
par(mfrow=c(3,2)) plot(true$F_ft[1,], type="l", lwd=4, xlab="Year", ylab="Fishing mortality", ylim=c(0,max(true$F_ft)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true$R_t, type="l", lwd=4, xlab="Year", ylab="Recruitment", ylim=c(0,max(true$R_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true$SPR_t, type="l", lwd=4, xlab="Year", ylab="SPR", ylim=c(0,1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true$D_t, type="l", lwd=4, xlab="Year", ylab="Relative spawning biomass", ylim=c(0,max(true$D_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true$Cw_ft[1,], type="l", lwd=4, xlab="Year", ylab="Catch", ylim=c(0,max(true$Cw_ft)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true$I_ft[1,], type="l", lwd=4, xlab="Year", ylab="Abundance index", ylim=c(0,max(true$I_ft)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5)
The LIME
package includes the plot_LCfits
function which can first be used to plot the length frequency data, and later used to plot the model fits to the length frequency data. The plot_LCfits
function requires the length frequency data in data frame form, which can be formatted correctly using the LFreq_df
function.
LF_df <- LFreq_df(true$LF)
The length frequency data frame can then be used in the plot_LCfits
function.
plot_LCfits(LF_df=LF_df)
LIME requires data input in a specific format:
Option 1: Matrix
With only one fleet, length composition data can be in a length frequency matrix, with years along the rows and length bins along the columns. Here are 5 years as an example:
LF_mat <- true$LF[,,1] LF_mat[1:5,]
Option 2: Array
The length composition data can be an array of matrices (one for each fleet). Columns should be named with the upper end of the length bin and rows should be named with the year.
First five years:
LF_array <- true$LF LF_array_sub <- array(LF_array[1:5,,1], dim=c(5, ncol(LF_array[,,1]), 1)) colnames(LF_array_sub) <- colnames(LF_mat) rownames(LF_array_sub) <- rownames(LF_mat)[1:5] LF_array_sub
Option 3: List
The length composition data could also be a list of matrices (one for each fleet).
One fleet, first five years:
LF_list <- list() LF_list[[1]] <- LF_array_sub[,,1] LF_list
Time series data (catch and abundance index) should me in matrix form with fleets along the rows and years along the columns. Rows and columns do not need to be explicitly named with fleet number or year, but the number of years should match the total number of years modeled. If data are not available in any given year there should be a negative place-holder.
With one fleet, the catch or abundance index time series should still be in matrix form rather than a vector (the model will be looking for the first row, for the one and only fleet).
Catch, one fleet:
true$Cw_ft
Abundance index, one fleet:
true$I_ft
Observations
LIME requires inputing observed data in a list form, including:
years
: inclusive years to be modeledLF
: length frequency data in matrix, array, or list formneff_ft
: effective sample size to use in case of multinomial distribution (otherwise use Dirichlet-multinomial to estimate effective sample size, or assume nominal sample size = effective sample size if not included)I_ft
: Index of abundance by fleet and time (if applicable)C_ft
: Catch by fleet and time (if applicable) in numbers or biomass (to be specified later) data_all <- list("years"=1:true$Nyears, "LF"=true$LF, "I_ft"=true$I_ft, "C_ft"=true$Cw_ft, "neff_ft"=true$obs_per_year)
Life history and starting values
LIME includes a function create_inputs
which checks the length frequency data and includes all input data, life history information, and starting values in a list together to input into LIME.
create_inputs
requires:
lh
: life history and starting values list output from create_lh_inputs
input_data
: input data list described in Data input section above. inputs_all <- create_inputs(lh=lh, input_data=data_all)
LIME is set up to estimate certain parameters and fix others by default.
LIME estimates by default:
log_F_ft
: matrix of fishing mortality estimates in log-space by fleet (rows) over time (columns)log_sigma_R
: recruitment standard deviation in log-spacelog_S50_f
: length at 50% selectivity in log space, one for each fleetlog_Sdelta_f
: difference between length at 95% selectivity and length at 50% selectivity in log-space, so that length at 95% selectivity can never be less than length at 50% selectivity, one for each fleetNu_input
: temporal variation in recruitment, treated as random effectParameters estimated by default under certain conditions:
log_theta
: Dirichlet-multinomial parameter relating to effective sample size (only estimated if using Dirichlet-multinomial to fit to length composition data, where run_LIME
argument LFdist=1
)beta
: equilibrium recruitment in log-space, serves as population scaling parameter (only estimated if fitting to catch data, run_LIME
argument data_avail
must include "Catch")log_q_f
: catchability coefficient in log-space, one for each fleet (only estimated if fitting to an index, run_LIME
argument data_avail
must include "Index")Other parameters that could be estimated but are fixed by default:
log_sigma_C
: observation error on catch, fixed at log(0.2)log_sigma_I
: observation error on abundance index, fixed at log(0.2)log_CV_L
: coefficient of variation on predicted age-length curve, fixed at log(0.1)run_LIME
functionThe function run_LIME
used to run LIME models has many settings, but hopefully most models will keep the defaults and use only a few arguments.
Basic inputs to run_LIME
:
modpath
: model path to save results and flags; default=NULL to save in R environment locallyinput
: list output from create_inputs
data_avail
: what data types will LIME fit to? "LC"= length composition only, "Index_LC"= index and length comps, "Catch_LC"= catch and length comps, and "Index_Catch_LC" = index, catch, and length comps. If fitting to catch data, the user must specify:
C_type
: default=0 (no catch data), 1 for catch in numbers, 2 for catch in biomassAdditional settings in run_LIME
:
LFdist
: which distribution to fit to length frequency data? default=1 to use Dirichlet-multinomial and estimate additional parameter related to effective sample size, multinomial = 0est_more
: additional parameters to estimate that are fixed by default (see Estimated parameters section)fix_more
: additional parameters to fix that are estimated by default (see Estimated parameters section) est_F_ft
: which F parameters to estimate? default=TRUE to estimate all. Otherwise, a matrix with fleets as rows and years along columns, with a 1 where the F parameter should be estimated and a 0 where the F parameter should be fixed at the starting value.f_startval_ft
: starting values for fishing mortality, default=NULL to start at 0.2 for all years and fleets. Otherwise, matrix of starting values for F where fleets are along the rows and years along the columns (e.g. can input the estimated F values from the previous run with Report$F_ft).rdev_startval_t
: starting values for recruitment deviations over time. default=NULL to start at 0 for all years. Otherwise, specify vector of recruitment deviations for all years to be modeled (e.g. can start at simulated truth and turn off estimation for debugging)est_selex_f
: estimate selectivity parameters? default=TRUE to estimate selectivity parameters for all fleets. Turn off selectivity estimation for all or a single fleet with FALSE (e.g. estimate selectivity parameters for fleet 1 but not fleet 2 with c(TRUE, FALSE))vals_selex_ft
: input selectivity-at-length (columns) by fleet (rows). default= matrix of negative values to estimate selectivity. Otherwise, input selectivity and set est_selex_f
=FALSE. This is how one would go about fixing the selectivity curve to a dome shape.randomR
: default=TRUE treats recruitment deviations as a random effect. FALSE turns the random effect off.newtonsteps
: number of extra Newton steps to take after optimization to help convergence, default=3; FALSE to turn off. Note: it seems with TMB that bounding parameters does not work when using newtonsteps. If a parameter is estimated outside of a reasonable bound, I recommend setting newtonsteps=FALSE.F_up
: upper bound on fishing mortality estimateS50_up
: upper bound on length at 50% selectivityFpen
: penalty on fishing mortality to avoid fishing mortality being estimated at drastically different values between years; default=1 on, otherwise 0=off.SigRpen
: penalty on sigmaR, default=1 on, otherwise off=0.SigRprior
: vector with prior info for normally distributed sigmaR penalty, first term is the mean and second term is the standard deviation; default =c(0.737, 0.3)derive_quants
: if TRUE, derive MSYbased reference points, default=FALSEest_totalF
: TRUE will estimate total fishing mortality across all fleets (number of parameters = number of years), not by fleet (where number of parameters would = number of years * number of fleets). default=FALSE estimates F by year and fleet.prop_f
: proportion of catch from each fleet, must sum to 1.0. When est_totalF
=TRUE, must assume how much of the catch comes from each fleet (e.g. c(0.5,0.5), c(0.6,0.4), etc.)mirror
: vector of parameter names matching those in Estimated parameters section to mirror between fleets (estimating one parameter instead of the same number of fleets)rewrite
: default=TRUE to rewrite LIME output (LIME_output.rds) in the directory modpath
. FALSE is useful for simulation testing if the process was stopped midway through 100 iterations, so a function can skip iterations with output.Settings in run_LIME
associated with simulation-testing:
simulation
: default=FALSE, but if TRUE, sets working directory in the iteration directory, not the general modpath
itervec
: vector of iterations for simulation, e.g. 1:100. default=NULL for stock assessment application with real data (1 iteration)Here's the best-case scenario for running LIME: 20 years of length composition, catch, and an abundance index, with a typical 200 samples of length measurements per year, assuming the life history information is known without error.
Here we specify C_type=2
because the catch data specified in data_all
in the Data input section was catch in weight, not numbers.
start <- Sys.time()
rich <- run_LIME(modpath=NULL, input=inputs_all, data_avail="Index_Catch_LC", C_type=2)
end <- Sys.time() - start end
The time difference above shows the run time of the data-rich case on the machine that compiled this document. The run time will vary depending on the machine, but hopefully this gives a ballpark of how long it takes to run a model with approximately 20 years and three data types.
The object rich
contains several elements of LIME output:
Inputs
: List of Data, Parameters, Map, and Random elements that are input directly to the LIME.cpp program. The goal is to help the user check the exact values and structure that went in to the LIME optimization using TMB. Report
: Report file of maximum likelihood estimates and derived values of population parameters from the LIME model run.Sdreport
: Standard errors of all estimated parameters and derived values of population parameters which are specified in the LIME.cpp file.obj
: the TMB objectopt
: the optimization results from TMBhelper::Optimizedf
: a data frame of final gradients for each estimated parameterinput
: input data list created using create_inputs
function. This is used when re-running the model using the function get_converged
.data_avail
: the data types available. This is used when re-running the model using the function get_converged
. General convergence criteria for LIME models includes:
We can check this for any model run using the following code:
gradient <- rich$opt$max_gradient <= 0.001 hessian <- rich$Sdreport$pdHess hessian==TRUE & gradient == TRUE
Because both the hessian and gradient checks passed, we assume that the program found the global minimum of the negative log likelihood surface and can move on to examine results.
The LIME
package includes functions to plot length composition model fits and other results.
LIME model fits to length frequency data:
plot_LCfits(Inputs=rich$Inputs, Report=rich$Report)
LIME estimates of key population parameters:
plot_output(Inputs=rich$Inputs, Report=rich$Report, Sdreport=rich$Sdreport, lh=lh, True=true, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("Fish"=c(0,1),"SPR"=c(0,1)))
This is the scenario for which LIME was developed: integrating multiple years of length data to account for time-varying fishing mortality and recruitment. The big change from the data-rich case in the code is to make sure data_avail="LC"
and C_type
is set to zero or removed from the function call (because the default is zero). We can still use the inputs_all
list that includes the index and catch data, but as long as data_avail="LC"
the model will not include the other data types.
start <- Sys.time()
lc_only <- run_LIME(modpath=NULL, input=inputs_all, data_avail="LC")
end <- Sys.time() - start end
Note the time difference is shorter than the data-rich case.
gradient <- lc_only$opt$max_gradient <= 0.001 hessian <- lc_only$Sdreport$pdHess hessian==TRUE & gradient == TRUE
We can use the same LF_df as was calculated in the data-rich case, but compare the LIME model fits from the model that excluded catch and abundance index:
plot_LCfits(LF_df=LF_df, Inputs=lc_only$Inputs, Report=lc_only$Report)
Model fits with length data only are of course much less accurate and with higher uncertainty than the data-rich case:
plot_output(Inputs=lc_only$Inputs, Report=lc_only$Report, Sdreport=lc_only$Sdreport, lh=lh, True=true, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("Fish"=c(0,0.5),"SPR"=c(0,1)))
The model meets convergence criteria but the fishing mortality rate estimates jump around between years more than is realistic. To attempt to curb this behavior, it is possible to decrease the standard deviation for fishing mortality which will serve to penalize estimates of fishing mortality that stray too far from the previous year.
inputs_all$SigmaF <- 0.1 lc_only2 <- run_LIME(modpath=NULL, input=inputs_all, data_avail="LC") gradient <- lc_only2$opt$max_gradient <= 0.001 hessian <- lc_only2$Sdreport$pdHess hessian==TRUE & gradient == TRUE
plot_output(Inputs=lc_only2$Inputs, Report=lc_only2$Report, Sdreport=lc_only2$Sdreport, lh=lh, True=true, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("Fish"=c(0,0.5),"SPR"=c(0,1)))
This is the scenario most similar to the ideas behind catch-curve stock reduction analysis (CCSRA). CCSRA used catch and age composition data to estimate stock status. LIME can use catch and length composition, which is much easier to collect than ages. With accurate assumptions about life history information, the combination of catch and length data can be very informative of stock status.
start <- Sys.time()
catch_lc <- run_LIME(modpath=NULL, input=inputs_all, data_avail="Catch_LC", C_type=2)
end <- Sys.time() - start end
gradient <- catch_lc$opt$max_gradient <= 0.001 hessian <- catch_lc$Sdreport$pdHess hessian==TRUE & gradient == TRUE
Model fits including catch data with length composition data:
plot_LCfits(LF_df=LF_df, Inputs=catch_lc$Inputs, Report=catch_lc$Report)
Including catch data improves estimates of population parameters over length data only.
plot_output(Inputs=catch_lc$Inputs, Report=catch_lc$Report, Sdreport=catch_lc$Sdreport, lh=lh, True=true, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("Fish"=c(0,0.5),"SPR"=c(0,1)))
There may be non-convergence issues (apparent from either a high final gradient or non-positive definite Hessian matrix). Some first checks in these cases are to 1) see if the Dirichlet-multinomial theta parameter is estimated too high, in which case it can be fixed to a high value and not estimated:
inputs_all$theta <- 50 check1 <- run_LIME(modpath=NULL, input=inputs_all, data_avail="LC", fix_more="log_theta")
Another option is to check if the recruitment standard deviation is estimated too high or too low, in which case it can be fixed at a reasonable value:
inputs_all$SigmaR <- 0.3 check2 <- run_LIME(modpath=NULL, input=inputs_all, data_avail="LC", fix_more="log_sigma_R")
Some length datasets may not be informative enough to tease apart fishing mortality, recruitment deviations, and selectivity. In these cases it may be convenient or helpful to fix selectivity at the starting values:
check3 <- run_LIME(modpath=NULL, input=inputs_all, data_avail="LC", est_selex_f=FALSE) plot_output(Inputs=check3$Inputs, Report=check3$Report, Sdreport=check3$Sdreport, lh=lh, True=true, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("Fish"=c(0,0.5),"SPR"=c(0,1)))
It is also possible to fix selectivity in a shape that is not the 2-parameter logistic curve. For example, we can set up dome-shaped selectivity using the 2-parameter logistic function on the left side of the fully selected lengths, and a normal distribution on the right side of the fully selected lengths:
## create input list with dome-shaped selectivity lh_dome <- create_lh_list(vbk=0.21, linf=65, t0=-0.01, lwa=0.0245, lwb=2.79, S50=c(20), S95=c(26), selex_input="length", selex_type=c("dome"), dome_sd=15, M50=34, maturity_input="length", M=0.27, binwidth=1, CVlen=0.1, SigmaR=0.737, SigmaF=0.2, SigmaC=0.1, SigmaI=0.1, R0=1, Frate=0.1, Fequil=0.25, qcoef=1e-5, start_ages=0, rho=0.43, theta=10, nseasons=1, nfleets=1) ## generate data with dome-shaped selectivity true_dome <- generate_data(modpath=NULL, itervec=1, Fdynamics=c("Endogenous"), Rdynamics="Constant", lh=lh_dome, Nyears=20, Nyears_comp=20, comp_sample=rep(200,20), init_depl=0.7, seed=123) ## create data list data_all <- list("years"=1:true_dome$Nyears, "LF"=true_dome$LF, "I_ft"=true_dome$I_ft, "C_ft"=true_dome$Cw_ft, "neff_ft"=true_dome$obs_per_year) inputs_dome <- create_inputs(lh=lh_dome, input_data=data_all)
plot(lh_dome$S_fl[1,], type="l", lwd=4, xlab="Length (cm)", ylab="Proportion selected to gear")
run_dome <- run_LIME(modpath=NULL, input=inputs_dome, data_avail="LC", vals_selex_ft=inputs_dome$S_fl) plot_output(Inputs=run_dome$Inputs, Report=run_dome$Report, Sdreport=run_dome$Sdreport, lh=lh, True=true, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("Fish"=c(0,0.5),"SPR"=c(0,1)))
With multiple fleets, all biological inputs are the same as with a single fleet. However, we must specify selectivity for each fleet. In this case, both selectivities are logistic.
lh_mf1 <- with(lh, create_lh_list(vbk=vbk, linf=linf, t0=t0, lwa=lwa, lwb=lwb, S50=c(20,30), S95=c(26,36), selex_input="length", selex_type=c("logistic","logistic"), M50=ML50, M95=NULL, maturity_input="length", M=M, h=h, binwidth=binwidth, CVlen=CVlen, SigmaR=SigmaR, SigmaF=SigmaF, SigmaC=SigmaC, SigmaI=SigmaI, R0=R0, Frate=Frate, qcoef=qcoef, start_ages=0, rho=rho, nseasons=1, nfleets=2))
Using the simulation function for multiple fleets, we need to specify the fleet dynamics for each (Fdynamics
), the number of years of length composition data for each (Nyears_comp
), and the proportions of each fleet (fleet_proportions
).
true_mf1 <- generate_data(modpath=NULL, itervec=1, Fdynamics=c("Constant","Endogenous"), Rdynamics="Constant", lh=lh_mf1, Nyears=20, Nyears_comp=c(20,10), comp_sample=rep(200, 20), init_depl=0.7, seed=123, fleet_proportions=c(0.6,0.4))
library(RColorBrewer) cols <- brewer.pal(3, "Set1") plot(lh_mf1$S_fl[1,], type="l", lwd=2, xlab="Length", ylab="Selectivity", col=cols[1]) lines(lh_mf1$S_fl[2,], type="l", lwd=2, col=cols[2])
par(mfrow=c(3,2)) plot(true_mf1$F_y, type="l", lwd=4, xlab="Year", ylab="Fishing mortality", ylim=c(0,max(true_mf1$F_y)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) for(i in 1:lh_mf1$nfleets){ lines(true_mf1$F_ft[i,], col=cols[i]) } plot(true_mf1$R_t, type="l", lwd=4, xlab="Year", ylab="Recruitment", ylim=c(0,max(true_mf1$R_t)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true_mf1$SPR_t, type="l", lwd=4, xlab="Year", ylab="SPR", ylim=c(0,1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(true_mf1$D_t, type="l", lwd=4, xlab="Year", ylab="Relative spawning biomass", ylim=c(0,max(c(1,max(true_mf1$D_t)))), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) plot(colSums(true_mf1$Cw_ft), type="l", lwd=4, xlab="Year", ylab="Catch", ylim=c(0,max(colSums(true_mf1$Cw_ft))*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5) for(i in 1:lh_mf1$nfleets){ lines(true_mf1$Cw_ft[i,], col=cols[i]) } plot(true_mf1$I_ft[1,], type="l", xlab="Year", ylab="Abundance index", ylim=c(0,max(true_mf1$I_ft)*1.1), xaxs="i", yaxs="i", cex.axis=1.5, cex.lab=1.5, col=cols[1]) lines(true_mf1$I_ft[2,], col=cols[2])
Move the generated data from an array to a data frame to plot the data.
LF_list <- lapply(1:true_mf1$nfleets, function(x){ return(true_mf1$LF[,,x]) }) LF_df <- LFreq_df(LF_list)
plot_LCfits(LF_df=LF_df)
data_mf1 <- list("years"=1:true_mf1$Nyears, "LF"=true_mf1$LF, "I_ft"=true_mf1$I_ft, "C_ft"=true_mf1$Cw_ft, "neff_ft"=true_mf1$obs_per_year) inputs_mf1 <- create_inputs(lh=lh_mf1, input_data=data_mf1)
For now, use the multinomial distribution for multiple fleets, specifying LFdist=0
.
rich_mf1 <- run_LIME(modpath=NULL, input=inputs_mf1, data_avail="Index_Catch_LC", C_type=2, LFdist=0)
Check that the data-rich model converges:
gradient <- rich_mf1$opt$max_gradient <= 0.001 hessian <- rich_mf1$Sdreport$pdHess hessian==TRUE & gradient == TRUE
plot_LCfits(Inputs=rich_mf1$Inputs, Report=rich_mf1$Report)
plot_output(Inputs=rich_mf1$Inputs, Report=rich_mf1$Report, Sdreport=rich_mf1$Sdreport, lh=lh_mf1, True=true_mf1, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("SPR"=c(0,1)))
With length data only, we need to estimate the total fishing mortality (as opposed to annual fishing mortality for each fleet). It's also useful to set the fishing mortality penalty to something relatively small to help the estimation.
inputs_mf1$SigmaF <- 0.1 lc_mf1 <- run_LIME(modpath=NULL, input=inputs_mf1, data_avail="LC", LFdist=0, est_totalF=TRUE) gradient <- lc_mf1$opt$max_gradient <= 0.001 hessian <- lc_mf1$Sdreport$pdHess hessian==TRUE & gradient == TRUE
plot_LCfits(Inputs=lc_mf1$Inputs, Report=lc_mf1$Report)
plot_output(Inputs=lc_mf1$Inputs, Report=lc_mf1$Report, Sdreport=lc_mf1$Sdreport, lh=lh_mf1, True=true_mf1, plot=c("Fish","Rec","SPR","ML","SB","Selex"), set_ylim=list("SPR"=c(0,1)))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.