This document illustrates how to simulate some stock-recruitment data using the TESAsamSim
R package and then fit a linear state-space model (aka Dynamic Linear model) with potentially time-varying parameters to it. The model is fit via Maximum likelihood followed by Kalman filtering and smoothing using a wrapper for the dlm
R package (see here). The wrapper is based in part on code generously provided by Cody Szuwalski (NOAA). For this illustrative example the simulations are based on the population characteristics of Interior Fraser River coho salmon.
First install necessary packages and load Interior Fraser Coho data used to condition the simulations.
## Check if necessary packages are available and install if necessary listOfPackages <- c("here", "parallel", "doParallel", "foreach", "tidyverse", "tictoc", "TESAsamSim") newPackages <- listOfPackages[!(listOfPackages %in% installed.packages()[ , "Package"])] if (length(newPackages)) { install.packages(newPackages) } lapply(listOfPackages, require, character.only = TRUE) # load wrapper function to fit dynamic linear models source(here("dlm-wrapper.R")) ## Load relevant input data # CU-specific parameters cuPar <- read.csv(here("data", "IFCohoPars", "cohoCUpars.csv"), stringsAsFactors=F) # Stock-recruit and catch data that are used to populate the simulation priming/conditioning period srDat <- read.csv(here("data", "IFCohoPars", "cohoRecDatTrim.csv"), stringsAsFactors=F)
Next, simulate some stock-recruitment data based on the parameters specified in cuPar
and the scenarios specified in simPar
which include whether population productivity or capacity are stationary or time-varying. To do this use the genericRecoverySim()
function. For our illustrative purposes we only need one Monte Carlo trial, but the simulator needs to run at least two, and we'll do this separately for each scenario (row) in simPar
which are different combinations of stable or declining productivity and capacity.
# Simulation run parameters describing different scenarios simPar <- read.csv(here("data", "IFCohoPars", "cohoSimPar.csv"), stringsAsFactors = F) # Directory where simulated data will be saved scenNames <- unique(simPar$scenario) dirNames <- sapply(scenNames, function(x) paste(x, unique(simPar$species), sep = "_")) # loop through each scenario and simulate data for (i in 1:4){ genericRecoverySim(simPar=simPar[i,], cuPar=cuPar, srDat=srDat, variableCU=FALSE, ricPars=NULL, #ricPars, #cuCustomCorrMat = cuCustomCorrMat, dirName="example", nTrials=2, makeSubDirs=FALSE, random=FALSE)}
In this example the outputs of the simulations are separately stored in ./outputs/simData/example, as a .csv for each scenario. So let's load one of them and since the simulator generates simulated data for 5 populations (Conservation Units) and 2 Monte Carlo trials lets just grab the output from one of them.
simDataStationary <- readRDS(here("outputs", "simData", "example", "stationary_ref_CUsrDat.RData"))$srDatout # simDataStationary <- read.csv(here("outputs", "simData", "example", # "stationary_ref_CUsrDat.csv"), # stringsAsFactors = F) simDataStationaryCU1 <- simDataStationary %>% filter(CU==1 & iteration ==1)
Now we can try fitting a Dynamic Linear Model to the simulated stock-recruitment data. The model in this case is a linearized Ricker stock-recruit model with log(recruits/spawner) as the dependent variable, log(recruits/spawner) at low spawner abundance as the intercept (alpha), and a density dependence term as the slope (beta which is the inverse of capacity). To fit the model use a wrapper function we wrote for the workshop fitDLM()
which calls functions from the dlm
package. We can specify whether the model estimate alpha and/or beta modeled as dynamic latent states that follow a random walk with TRUE/FALSE statements in the function call.
For now we will set the model to estimate time-varying alpha alpha_vary=TRUE
but not beta beta_vary=FALSE
, even thought the simulated data are based on a stationary alpha and beta.
# rename a couple of columns to play nice with fitDLM. colnames(simDataStationaryCU1)[c(2,4,5,8,9)] <- c("byr","spwn", "rec", "alpha_true", "beta_true") # fit DLM dlm_model <- fitDLM(data = simDataStationaryCU1, alpha_vary = TRUE, beta_vary = FALSE)
The function fitDLM
returns a list with two elements. The first element results
is a data frame containing the original simulated data plus Kalman filtered and smoothed estimates of alpha and beta over time. The second element is the AICc of the model though it should be noted that we have some unanswered questions about whether this is calculated appropriately.
And then we can plot the simulated stock-recruitment data (with points color-coded by year) and compare the true and estimated values of alpha (intercept) and beta (slope).
plotDLM(dlm_model)
No bad, in this case the model does a good job recovering the true (non-time-varying) parameter estimates.
Now let see how the model does with a population that has time varying productivity (alpha) which increases threefold over the last 30 year of the simulation.
# first read in simulate data for scenario with increasing productivity # simDataIncreaseProd <- read.csv(here("outputs", "simData", "example", # "increaseProd_ref_CUsrDat.csv"), # stringsAsFactors = F) simDataIncreaseProd <- readRDS(here("outputs", "simData", "example", "increaseProd_ref_CUsrDat.RData"))$srDatout simDataIncreaseProdyCU1 <- simDataIncreaseProd %>% filter(CU==1 & iteration ==1) colnames(simDataIncreaseProdyCU1)[c(2,4,5,8,9)] <- c("byr","spwn", "rec", "alpha_true", "beta_true") # then fit DLM dlm_model <- fitDLM(data = simDataIncreaseProdyCU1, alpha_vary = TRUE, beta_vary = FALSE) # then plot plotDLM(dlm_model)
This is pretty decent as well. How about a scenario where only capacity (1/beta) changes over time and where we fit a model that only estimates time varying beta?
# first read in simulate data for scenario with increaseing productivity # simDataIncreaseCapacity <- read.csv(here("outputs", "simData", "example", # "increaseCapacity_ref_CUsrDat.csv"), # stringsAsFactors = F) simDataIncreaseCapacity <- readRDS(here("outputs", "simData", "example", "increaseCapacity_ref_CUsrDat.RData"))$srDatout simDataIncreaseCapacityCU1 <- simDataIncreaseCapacity %>% filter(CU==1 & iteration ==1) colnames(simDataIncreaseCapacityCU1)[c(2,4,5,8,9)] <- c("byr","spwn", "rec", "alpha_true", "beta_true") # then fit DLM dlm_model <- fitDLM(data = simDataIncreaseCapacityCU1, alpha_vary = FALSE, beta_vary = TRUE) # then plot plotDLM(dlm_model)
The model does well here too.
Lastly, how does the DLM do with a population that has time varying productivity (alpha) and capacity (1/beta)?
# first read in simulate data for scenario with increasing productivity and capacity # simDataIncreaseProdCapacity <- read.csv(here("outputs", "simData", "example", # "IncreaseProd&Capacity_ref_CUsrDat.csv"), # stringsAsFactors = F) simDataIncreaseProdCapacity <- readRDS(here("outputs", "simData", "example", "increaseProd&Capacity_ref_CUsrDat.RData"))$srDatout simDataIncreaseProdCapacityCU1 <- simDataIncreaseProdCapacity %>% filter(CU==1 & iteration ==1) colnames(simDataIncreaseProdCapacityCU1)[c(2,4,5,8,9)] <- c("byr","spwn", "rec", "alpha_true", "beta_true") # then fit DLM dlm_model <- fitDLM(data = simDataIncreaseProdCapacityCU1, alpha_vary = TRUE, beta_vary = TRUE) # then plot plotDLM(dlm_model)
Looks like the model has a hard time estimating both alpha and beta, at least for this particular simulated dataset, when both change the same magnitude and direction over the same time period.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.