knitr::opts_chunk$set(collapse = T, comment = "#>", warning=FALSE)
library(bbm)
set.seed(987)
# LOAD other packages, including bbm, here

Introduction

bbm is an open-source R package that provides an FLR implementation of the two-stage biomass-based model for the Bay of Biscay anchovy [@bbm_Ibaibarriaga2008]. This model describes the stock dynamics in terms of biomass and separates the population into two stages: recruits and adults. Thus, it has lower data demands than fully age-structured models, while it is able to track the main dynamics of the population with more detail than biomass dynamic models. Besides the application to the Bay of Biscay anchovy, similar models have been applied to other case studies [@Giannoulaki2014; @Gras2014; @Roel2000; @Roel2009].

The implementation available in this package estimates the model parameters by maximum likelihood through the TMB package [@tmb_Kristensen2016]. Additionally, the model has been generalised for an unlimited number of indices that can occur at different times of the year. The package uses the S4 classes and methods developed by the FLR project [http://flr-project.org/R; @FLR_KellMG2007].

This document explains the basic use of the package bbm. The package and documentation are available at http://flr-project.org/bbm.

Installation

The package requires the packages TMB and FLCore to be installed. TMB can be installed from CRAN using the install.packages() command, whereas FLCore can be installed from the FLR project repository:

  install.packages('TMB')
  install.packages(c("FLCore"), repos="http://flr-project.org/R")

An stable version of bbm can be installed from the FLR repository (http://flr-project.org/R) with the command:

  install.packages(c("bbm"), repos="http://flr-project.org/R")

A development version is available from GitHub repository (https://github.com/flr/bbm/).

    library(devtools)
    install_github('flr/bbm')

Once installed, the package can be loaded using:

    library(bbm)

Getting started: Bay of Biscay anchovy

The package contains data and additional objects required to run the Bay of Biscay anchovy example from @bbm_Ibaibarriaga2008. They can be loaded using:

  data(ane)

The dataset documentation can be consulted by using:

  ?ane

The data consist on four objects: catch.ane, indicesB.ane, indicesP.ane, control.ane and inits.ane. The first object, catch.ane, is an FLQuant with the Bay of Biscay anchovy catch in tonnes from 1987 to 2006 for the two age classes (recruits and adults) and two periods (before and after the spring surveys in mid-May). Note that the catch of the second period of the last year were not available in @bbm_Ibaibarriaga2008. However, the model fitting function does not allow any missing value in the catch.ane object, and the NA's were replaced by very small non-negative values so that the total catch was 0.001 and the age 1 proportion was 0.5.

  class(catch.ane)
  dim(catch.ane)
  catch.ane

The catch in tonnes per age class and period can be plotted as:

  xyplot(data~year|age+season, data=catch.ane, type="l", main="Total Catch (t)")

Let's define nyrs as the number of years:

  nyrs <- dim(catch.ane)[2]  
  nyrs
  years <- dimnames(catch.ane)$year

Then, we can plot the proportion of the recruits (age 1) in the catch for each of the periods:

  xyplot( data~year|season,
          data=FLQuants(period1=catch.ane[1,,,1,,]/quantSums(catch.ane[,,,1,,]),
                        period2=catch.ane[1,1:(nyrs-1),,2,,]/quantSums(catch.ane[,1:(nyrs-1),,2,,])),
          type="l", main="Catch proportion of recruits by period", ylab="")

The object indicesB.ane is of class FLIndices, which is a list of two elements of the FLIndex class. Each of them contains the data of the two spring surveys: BIOMAN DEPM survey conducted by AZTI and the PELGAS acoustic survey conducted by IFREMER. The index slot of each FLIndex object contains the total biomass estimates at the time of each survey.

  length(indicesB.ane)
  names(indicesB.ane)  
  lapply(indicesB.ane, index)

The object indicesP.ane is of class FLIndices, which is a list of two elements of the FLIndex class. Each of them contains the recruits proportion (in mass) estimates of the BIOMAN DEPM survey and the PELGAS acoustic survey.

  length(indicesP.ane)
  names(indicesP.ane)  
  lapply(indicesP.ane, index)

Besides the index slot, it is important to specify the timing of each index within the range slot of each FLIndex. In this case, both surveys are assumed to occur simultaneously at mid-May, so the start and end dates of each survey (startf and endf) are set equal to 0.375 = 5.5/12.

  lapply(indicesB.ane, range)
  lapply(indicesP.ane, range)

Each survey is assumed to occur at the middle of the start and end dates. The fraction of the year until that date can be computed as:

  findicesB.ane <- unlist(lapply(indicesB.ane, function(x) return(mean(range(x)[c('startf','endf')])))) 
  findicesB.ane
  findicesP.ane <- unlist(lapply(indicesP.ane, function(x) return(mean(range(x)[c('startf','endf')])))) 
  findicesP.ane

The timing of the surveys is important as it will define the number of periods within the year. The function periods returns a list with the number of periods (nper), the fraction of the year corresponding to each period (f) and a vector indicating the beginning of which period corresponds to each index. In this case the timings of the biomass and recruits indices define two periods within the year, that correspond to 0.375 and 0.675 fractions. In addition, the two biomass indices provide information at the beginning of the second period and the two recruits proportion indices also.

  per <- periods(findicesB=findicesB.ane, findicesP=findicesP.ane)
  per 

We can plot the total biomass from each index:

  dat <- FLQuants()
  for (i in 1:length(indicesB.ane)) dat[[i]] <- index(indicesB.ane[[i]])
  names(dat) <- names(indicesB.ane)
  xyplot( data~year|qname, data=dat,
          type="b", main="Total biomass by survey", ylab="Total biomass (t)")

And the age 1 biomass proportion from each index:

  dat <- FLQuants()
  for (i in 1:length(indicesP.ane)) dat[[i]] <- index(indicesP.ane[[i]])
  names(dat) <- names(indicesP.ane)
  xyplot( data~year|qname, data=dat,
          type="b", main="Recruits' biomass proportion by survey", ylab="")

The control.ane object is of class bbmControl.

class(control.ane)

It has two slots: g and param.fix.

slotNames(control.ane)

The slot g is a named vector that specifies the instantaneous rate of biomass decrease for each age class, which is the difference between the annual intrinsic growth and the natural mortality rates. In @bbm_Ibaibarriaga2008 the instantaneous biomass decrease was assumed to be age and time invariant and equal to 0.68.

control.ane@g

The second slot of the control.ane object is param.fix that is of class FLPar for all the parameters to be estimated by the model.

class(control.ane@param.fix)

Each element of the param.fix slot takes the value 0 if the parameter has to be estimated and takes the value 1 if the parameter is fixed to the initial value. In this first example all the parameters are estimated. Other variants will be illustrated later.

control.ane@param.fix

In other words, there are no fixed elements:

sum(control.ane@param.fix==1) # number of fixed parameters

The inits.ane object is of class FLPar and it contains initial values for all the model parameters.

class(inits.ane)
inits.ane

The initial parameters should provide biomasses large enough to support the level of observed catches. Given the instantaneous annual biomass decrease rates, the fraction of the year corresponding to each period, an FLQuant for the catches and an object of the class FLpar, the function calcPop calculates the resulting biomasses and checks that the resulting biomasses by age group are positive.

out <- calcPop(g=control.ane@g, f=per$f, catch=catch.ane, inits=inits.ane)
names(out)
out$ok
out$stock.bio

Given catch.ane, indices.ane, control.ane and inits.ane, the model is fitted simply by:

  run <- bbm(catch.ane, 
             indicesB=FLQuants(depm=index(indicesB.ane[[1]]), acoustic=index(indicesB.ane[[2]])), 
             indicesP=FLQuants(depm=index(indicesP.ane[[1]]), acoustic=index(indicesP.ane[[2]])),         
             findicesB=findicesB.ane, 
             findicesP=findicesP.ane,
             control=control.ane, inits=inits.ane)

The output object is of class bbmFit.

  class(run)

And it has the following slots:

  slotNames(run)

The convergence should be checked:

  run@convergence

The fitted model parameters and their corresponding standard errors can be extracted directly using the accessors:

  params(run)
  params.se(run)
  vcov(run)

The value of the log likelihood, the AIC and BIC from the fitted object can be obtained by:

  logLik(run)
  AIC(run)
  BIC(run)

We can access the stock biomasses and the fitted indices from the output object:

   stock.bio(run)
   indicesB(run)
   indicesP(run)

The fitted object can be plotted:

   plot(run) 

The Pearson residuals can be obtained by:

  res <- residuals(run)
  res

The output object is of class bbmFitresiduals.

  class(res)

The class bbmFitresiduals has one slot for the residuals of the indices in biomass (residuals.B) and another one for the percentage of recruits (residuals.P). Both slots are of class FLQuants, with one element per survey index.

  slotNames(res)
  res@residuals.B$depm

Then, we can plot the residuals to check that there are no patterns:

   plot(res) 
   #qqmath(res)

Starting from different initial values

The initial values for the optimization could be set by hand or calculated automatically using the function createInits in the package as we will show later. If the optimization works properly the results should be independent of the initial values of the optimization.

In order to create our own initial values, we first generate an empty object with the correct parameter names by using the function bbmFLPar and we then fill it directly with the selected values:

inits.ane2 <- bbmFLPar( years=dimnames(catch.ane)$year, namesB=names(indicesB.ane), namesP=names(indicesP.ane),
                        niter=dim(catch.ane)[6])

inits.ane2[] <- c( 0.6, 0.6, 100, 100, 3, 3, 60000, rep(40000, nyrs), 10, 2)
inits.ane2

Alternatively, initial values can be generated automatically from the dat using the function createInits:

inits.ane3 <- createInits( catch.ane, indicesB=indicesB.ane, indicesP=indicesP.ane, 
                           g=control.ane@g)
inits.ane3

Then, we fit the model starting from different initial values:

  run1 <- bbm(catch.ane, indicesB=indicesB.ane, indicesP=indicesP.ane, control=control.ane, inits=inits.ane)
  run2 <- bbm(catch.ane, indicesB=indicesB.ane, indicesP=indicesP.ane, control=control.ane, inits=inits.ane2)
  run3 <- bbm(catch.ane, indicesB=indicesB.ane, indicesP=indicesP.ane, control=control.ane, inits=inits.ane3)

We can compare the fitted parameters:

  params(run1)
  params(run2)
  params(run3)

And the corresponding AIC values:

  AIC(run1)
  AIC(run2)
  AIC(run3)

The time series of estimated recruits can be plotted:

parnames <- sapply(dimnames(run1@params)$params, function(x) unlist(strsplit(x,split="_"))[1])
dat <- cbind( run1=c(params(run1)[parnames %in% "R"]), run2=c(params(run2)[parnames %in% "R"]), 
              run3=c(params(run3)[parnames %in% "R"]))
rownames(dat) <- years

matplot( dat, type="l", ylab="R (t)", xlab="year", lty=1, col=c('black','red','green'), xaxt = "n")
axis(1, at=1:nyrs, labels=years)
legend( "topright", paste("run",1:3,sep=""), lty=1:3, col=c('black','red','green'), bty="n")

Fixing some parameters

Any of the model parameterscan be fixed by setting its param.fix value equal to 1. Then, the parameter will be fixed to the initial value and won't be estimated.

We fix the catchability of the biomass estimate from the depm survey, as follows:

param.fix <- bbmFLPar( 0, years=dimnames(catch.ane)$year, niter=dim(catch.ane)[6], 
                       namesB=names(indicesB.ane), namesP=names(indicesP.ane)) 
param.fix['q_depm'] <- 1

control.ane2 <- bbmControl(g=c(rec=0.68, adult=0.68), param.fix=param.fix)
  run4 <- bbm( catch.ane, indicesB=indicesB.ane, indicesP=indicesP.ane, 
               control=control.ane2, inits=inits.ane)

The estimated parameters and their corresponding standard errors are:

  params(run4)
  params.se(run4)

Where, values for the q_depm parameter are:

  inits.ane$q_depm
  run4@params$q_depm
  run4@params.se$q_depm

Simulated data set

The package also includes a function to simulate data. Say, that we want to create biomass and recruits proportion indices from a survey conducted at mid-year. So the year is separated into two seasons. First, we need an FLQuant with the catch in biomass for the recruits and the adults in each of the seasons. Starting from the landings of ple4, we can generate it as follows:

data(ple4)

aux <- landings.n(ple4)*landings.wt(ple4)
catch.ple4 <- FLQuant(NA, dim=c(2, dim(landings.n(ple4))[2], 1, 2, 1, 1), dimnames=list(year=dimnames(landings.n(ple4))$year))
catch.ple4[1, , ,1:2, ,] <- aux[1,]/2
catch.ple4[2, , ,1:2, , ] <- quantSums(aux[-1,])/2
catch.ple4

We assume that the true values of the model parameters we are going to use to simulate are:

rr <- rlnorm(dim(catch.ple4)[2], log(300000), 1/sqrt(3))
par.ple4 <- bbmFLPar(years=dimnames(catch.ple4)$year, namesB=c("Mysurvey"), namesP=c("Mysurvey"), niter=1)
par.ple4[] <- c(1, 200, 4, 200000, rr, log(250000), 3)
par.ple4

And the true values of the biomass change rates for recruits and adults are:

g <- c(rec=0.3, adult=0.25)

We can compute the true values of recruits and adults biomass at the beginning of each season and check that these values are large enough to support the observed levels of catches (i.e. there are no negative biomasses):

pop.ple4 <- calcPop(g=g, f=c(0.5, 0.5), catch=catch.ple4, inits=par.ple4)
pop.ple4

The observed indices of total biomass and recruits biomass proportion of a survey named "Mysurvey" conduceted at the middle of the year are generated as follows:

indices.ple4 <- simIndices( catch.ple4, g=g, inits=par.ple4, 
                         findicesB=c(Mysurvey=0.5), findicesP=c(Mysurvey=0.5) )

The resulting object is a list of two FLIndices: one for the total biomass indices and one for the recruits biomass proportions:

length(indices.ple4)
names(indices.ple4)

lapply(indices.ple4$Btot, index)
lapply(indices.ple4$Btot, range)
lapply(indices.ple4$Prec, index)
lapply(indices.ple4$Prec, range)

Now, we can prepare the elements necessary to fit the model to these indices. Basically, we need a bbmControl object and an FLPar with initial parameters:

param.fix <- par.ple4
param.fix[] <- 0 # dummy FLPar indicating which parameters are fixed (0 estimated and 1 fixed)

control.ple4 <- new( "bbmControl", g=g, param.fix=param.fix)  # bbmControl. We assumed g is known exactly 

inits.ple4 <- createInits(catch.ple4, indicesB=indices.ple4$Btot, indicesP=indices.ple4$Prec, g=g) # create automatic initial parameters

Then, the model is fitted as follows:

fit.ple4 <- bbm(catch.ple4, indicesB=indices.ple4$Btot, indicesP=indices.ple4$Prec, 
                control=control.ple4, inits=inits.ple4)

We can check the results

params(fit.ple4)
params.se(fit.ple4)
logLik(fit.ple4)
AIC(fit.ple4)
BIC(fit.ple4)
(params(fit.ple4) - par.ple4)/par.ple4
res <- residuals(fit.ple4)
plot(res)
qqmath(res)

More information

    library(devtools)
    #install_github('flr/bbm')

Software Versions

Authors information

Leire Ibaibarriaga. AZTI-Tecnalia. Txatxarramendi Ugartea z/g, E-48395 Sukarrieta (Bizkaia) Spain.

Sonia Sanchez. AZTI-Tecnalia. Herrera Kaia Portualdea z/g, E-20110 Pasaia (Gipuzkoa) Spain.

References



flr/bbm documentation built on Sept. 6, 2022, 8:56 p.m.