```r
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)

library(here)
library(tidyverse)
library(atlantisom)
library(ggthemes)
library(FSA)
```
Here we use existing Atlantis ecosystem model output to generate input datasets for a variety of simpler population models, so that the performance of these models can be evaluated against known (simulated) ecosystem dynamics. Atlantis models simulate a wide range of physical and ecological processes, including a full food web, and can be run using different climate forcing, fishing, and other scenarios.
We extract simulated data using the R package atlantisom. The purpose of atlantisom is to use existing Atlantis model output to generate input datasets for a variety of models, so that the performance of these models can be evaluated against known (simulated) ecosystem dynamics. Atlantis models can be run using different climate forcing, fishing, and other scenarios. Users of atlantisom will be able to specify fishery-independent and fishery-dependent sampling in space and time, as well as species-specific catchability, selectivity, and other observation processes for any Atlantis scenario. The atlantisom outputs are internally consistent multispecies and ecosystem datasets with known observation error characteristics, for use in individual model performance testing, comparing performance of alternative models, and performance testing of model ensembles against "true" Atlantis outputs.
Our initial species selection includes 11 single species groups from the Atlantis NOBA model.
```r
#fgs <- load_fgs(here("simulated-data/atlantisoutput/NOBA_March_2020"), "nordic_groups_v04.csv")

lname <- data.frame(Name = c("Long_rough_dab", "Green_halibut", "Mackerel", "Haddock",
                             "Saithe", "Redfish", "Blue_whiting", "Norwegian_ssh",
                             "North_atl_cod", "Polar_cod", "Capelin"),
                    Long.Name = c("Long rough dab", "Greenland halibut", "Mackerel", "Haddock",
                                  "Saithe", "Redfish", "Blue whiting",
                                  "Norwegian spring spawning herring", "Northeast Atlantic Cod",
                                  "Polar cod", "Capelin"),
                    Latin = c("*Hippoglossoides platessoides*", "*Reinhardtius hippoglossoides*",
                              "*Scomber scombrus*", "*Melanogrammus aeglefinus*",
                              "*Pollachius virens*", "*Sebastes mentella*",
                              "*Micromesistius poutassou*", "*Clupea harengus*",
                              "*Gadus morhua*", "*Boreogadus saida*", "*Mallotus villosus*"),
                    Code = c("LRD", "GRH", "MAC", "HAD", "SAI", "RED", "BWH", "SSH",
                             "NCO", "PCO", "CAP")
)

# sppsubset <- merge(fgs, lname, all.y = TRUE)
# spptable <- sppsubset %>%
#   arrange(Index) %>%
#   select(Name, Long.Name, Latin)

spptable <- lname %>%
  select(Name, Long.Name, Latin)

knitr::kable(spptable, col.names = c("Model name", "Full name", "Latin name"))
```
Below we will double-check survey census outputs against the true values, and then test some survey specifications that may be reasonable for multispecies model testing. Finally, we will apply survey functions to the detailed diet data.
- `NOBA2config.R` specifies the location and names of files needed for atlantisom to initialize.
- `omdimensions.R` standardizes timesteps, etc. (this is part of atlantisom and should not need to be changed by the user).
- `mssurvey_spring.R` and `mssurvey_fall.R` configure the fishery-independent surveys (in this census test, surveys sample all model polygons in all years and have an efficiency of 1 for all species, with no size selectivity); a rough sketch of the kinds of settings involved follows this list.
- `msfishery.R` configures the fishery-dependent data.
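The contents of the configuration files themselves are not reproduced here. As a rough, hypothetical illustration only (object names follow those referenced by the survey code later in this document; the values shown are placeholders, not the settings actually used), a census survey configuration defines objects along these lines:

```r
# Illustrative sketch only -- see simulated-data/config for the real files
survey.name <- "census_survey_example"                        # hypothetical survey name
survboxes   <- allboxes                                       # census: sample every model polygon
                                                              # (allboxes assumed from omdimensions.R)
surveffic   <- data.frame(species = survspp, efficiency = 1)  # efficiency of 1 for all species
surv_cv     <- data.frame(species = survspp, cv = 0)          # no added observation error
# selectivity objects (flat, i.e. no size/age selectivity in the census case) are set similarly
```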
True datasets are generated as follows, using atlantisom wrapper functions: `om_init` to assemble initial true Atlantis data, `om_species` to subset true data for desired species, `om_index` to generate survey biomass and total catch biomass indices, `om_comps` to generate age and length compositions and average weight at age from surveys and fisheries, and `om_diet` to generate diet from surveys. Outputs are saved to the `atlantisoutput` folder (not kept on GitHub due to size):
```r
NOBAom <- om_init(here("simulated-data/config/NOBA2Config.R"))

NOBAom_ms <- om_species(sppsubset$Name, NOBAom)

#need to change internal call to source in atlantisom om_index om_comps and om_diet functions
#expecting a config folder in same directory as rmd
#this is a workaround
dir.create(file.path(here("docs/config")))
file.copy(here("simulated-data/config/omdimensions.R"), here("docs/config/omdimensions.R"))

NOBAom_ms_ind <- om_index(usersurvey = c(here("simulated-data/config/mssurvey_spring.R"),
                                         here("simulated-data/config/mssurvey_fall.R")),
                          userfishery = here("simulated-data/config/msfishery.R"),
                          omlist_ss = NOBAom_ms,
                          n_reps = 1,
                          save = TRUE)

NOBAom_ms_comp <- om_comps(usersurvey = c(here("simulated-data/config/mssurvey_spring.R"),
                                          here("simulated-data/config/mssurvey_fall.R")),
                           userfishery = here("simulated-data/config/msfishery.R"),
                           omlist_ss = NOBAom_ms,
                           n_reps = 1,
                           save = TRUE)

NOBAom_ms_diet <- om_diet(config = here("simulated-data/config", "NOBA2config.R"),
                          dietfile = "NOBADetDiet.gz",
                          usersurvey = c(here("simulated-data/config/mssurvey_spring.R"),
                                         here("simulated-data/config/mssurvey_fall.R")),
                          omlist_ss = NOBAom_ms,
                          n_reps = 1,
                          save = TRUE)

unlink(here("docs/config"), recursive = TRUE)
```
Our goal was to have a reproducible process for all aspects of data generation through model inputs. This section gives examples of how input datasets were built. Now that the simulated data is included in the `mskeyrun` data package, data inputs are derived from those sources.
Example input files are in the `model-input-files` directory. First we write the fairly simple biomass and catch inputs for the multispecies surplus production model MSSPM from our census survey outputs.
o.name <- here("simulated-data/msinput/msspm/") stepperyr <- if(NOBAom_ms$runpar$outputstepunit=="days") 365/NOBAom_ms$runpar$toutinc #read in survey biomass data survObsBiom <- read_savedsurvs(d.name, 'survB') #reads in all surveys #multiple surveys named in list object, use as filename for(s in names(survObsBiom)){ #arrange into wide format: year, Species1, Species2 ... and write csv svbio <- survObsBiom[[s]][[1]] %>% #calc_timestep2time(d.name, ., run.prm.file) %>% #this gives an arbitrary start year of 1970 dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, year, atoutput) %>% tidyr::spread(species, atoutput) %>% write_csv(paste0(o.name,s,".csv")) } #read in catch data read_savedfisheries <- function(dir, type){ datlook <- data.frame(dattype = c('Catch', 'catchAge', 'catchLen', 'catchWtage', 'catchAnnAge', 'catchAnnWtage'), pattern = c("*fishCatch.rds", "*fishObsAgeComp.rds", "*fishObsLenComp.rds", "*fishObsWtAtAge.rds", "*fishObsFullAgeComp.rds", "*fishObsFullWtAtAge.rds")) fisheries <- list.files(path=dir, pattern = as.character(datlook$pattern[datlook$dattype %in% type]), full.names = TRUE) fishery.name <- str_match(fisheries, paste0(scenario.name,"_\\s*(.*?)\\s",datlook$pattern[datlook$dattype==type]))[,2] dat <- lapply(fisheries, readRDS) names(dat) <- fishery.name return(dat) } catchbio_ss <- read_savedfisheries(d.name, 'Catch') for(c in names(catchbio_ss)){ #arrange into wide format: year, Species1, Species2 ... and write csv catchbio <- catchbio_ss[[c]][[1]] %>% dplyr::mutate(year = time/365) %>% dplyr::select(species, year, atoutput) %>% tidyr::spread(species, atoutput) %>% write_csv(paste0(o.name,"fishery_",c,".csv")) } #write readme for metadata meta <- c("Simulated data from Norweigan-Barents Atlantis Model by Hansen et al. \n\n", paste("Atlantis outputs stored in", d.name, '\n\n'), paste("Biomass output for surveys", names(survObsBiom), '\n\n'), paste("Fishery catch output for fisheries", names(catchbio_ss), '\n\n'), "Biomass and catch output units are annual tons.\n\n", "Survey specifications:\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_fall.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_spring.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_fall_01.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_spring_01.R")), "\n\n", "Fishery specifications:\n", "----------------------\n", read_file(here("/simulated-data/config/msfishery.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/msfishery_01.R")), "\n" ) cat(as.character(meta), file=paste0(here("simulated-data/msinput/msspm/"),"README.txt"))
The MSSPM needs diet composition information as well. Here we supply two survey time series of diet data, one census and a trial survey dataset with more typical survey coverage issues.
Diet composition is by predator age class for each time (unit=time.days) and is a proportion (sums to 1 for each time). We get species diet composition by aggregating over age classes (weighting diet composition at age by true biomass at age).
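In equation form, the aggregation implemented in the next chunk is a biomass-weighted average over age classes: for predator species $s$, prey $p$, and time $t$,

$$D_{s,p,t} \;=\; \frac{\sum_{a} d_{s,a,p,t}\, B_{s,a,t}}{\sum_{a} B_{s,a,t}},$$

where $d_{s,a,p,t}$ is the sampled diet proportion of prey $p$ for predator age class $a$ and $B_{s,a,t}$ is true biomass at age.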
source(here("simulated-data/config/NOBA2Config.R")) all_diets <- read_savedsurvs(d.name, 'survDiet') # divide time.days by runpar$toutinc to match time in biomass # aggregate over ages using true biomass at age (survey diet is in proportion by age) NOBAom_ms <- readRDS(file.path(d.name, paste0(scenario.name, "omlist_ss.rds"))) trueB_age_allbox <- NOBAom_ms$truebio_ss %>% #sum over polygons #aggregate(dat$atoutput,list(dat$species,dat$agecl,dat$time, dat$prey),sum) group_by(species,agecl,time) %>% summarise(trueBatage = sum(atoutput)) trueB <- trueB_age_allbox %>% group_by(species, time) %>% summarise(trueB = sum(trueBatage)) trueBagecl <- merge(trueB_age_allbox, trueB) #multiple surveys named in list object, use as filename for(s in names(all_diets)){ #arrange into wide format: year, Species1, Species2 ... and write csv svdiet <- all_diets[[s]][[1]] %>% #calc_timestep2time(d.name, ., run.prm.file) %>% #this gives an arbitrary start year of 1970 dplyr::mutate(time = time.days/NOBAom_ms$runpar$toutinc) %>% dplyr::left_join(., trueBagecl) %>% dplyr::mutate(dietBwt = dietSamp*trueBatage) %>% dplyr::group_by(species,time,prey,trueB) %>% dplyr::summarise(totdietBwt = sum(dietBwt)) %>% dplyr::mutate(totdietComp = totdietBwt/trueB) %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::ungroup() %>% dplyr::select(year, species, prey, totdietComp) %>% write_csv(paste0(o.name,"diet_",s,".csv")) }
Note that this is the old method, preserved here for history; Hydra inputs now come directly from `mskeyrun`. Much of this code formed the basis for the functions that build mskeyrun simulated data objects, now in `mskeyrun\data-raw\R`.
For the length-based model hydra, we can write .csv files to be used as input in the hydradata package, which writes .dat and .pin files.
Hydra needs survey biomass time series, catch time series, survey length composition, fishery length composition, and diet composition data.
Observations for biomass and catch are very similar to those created for MSSPM. Hydra inputs will need to be in the same species order for each dataset, so first we make a species file and some of the other input data that can come straight from Atlantis.
# foodweb and species file identify predators and prey; take from true atlantis diet comp diettrue <- load_diet_comp(here("simulated-data/atlantisoutput/NOBA_march_2020"), "nordic_runresults_01DietCheck.txt", fgs = fgs) ms_diet <- diettrue %>% filter(species %in% sppsubset$Name) %>% filter(atoutput>0) %>% mutate(species = as.character(species), prey = as.character(prey)) predPrey <- ms_diet %>% filter(prey %in% unique(species)) %>% select(species, agecl, time.days, prey, atoutput) %>% group_by(species, prey) %>% summarise(avgpreyprop = mean(atoutput)) foodweb <- left_join(sppsubset, predPrey, by=c("Name" = "species")) %>% select(species, prey, avgpreyprop) %>% mutate(avgpreyprop = ceiling(avgpreyprop)) %>% spread(species, avgpreyprop) %>% filter(!is.na(prey)) %>% replace(is.na(.), 0) predOrPrey <- ifelse(colSums(foodweb[-1])>0, 1, 0) specieslist <- data.frame(species = unique(unique(survObsBiom[[2]][[1]]$species)), guild = rep("na"), guildmember = c(1:11), predOrPrey = predOrPrey ) # take lenwt parameters from true atlantis pars lenwtpar <- left_join(sppsubset, NOBAom_ms$biol$wl, by=c("Code" = "group")) %>% select(Name, a, b) %>% arrange(Name) # fixed parameters: # intake same for all teleosts intakespp <- rbind(species = sort.default(sppsubset$Name), alpha = rep(0.004), beta = rep(0.11)) colnames(intakespp) <- intakespp[1,] intakespp <- as.data.frame(intakespp[-1,]) # predator size preference ratio same for all species sizepref <- rbind(species = sort.default(sppsubset$Name), mu = rep(0.5), sigma = rep(2)) colnames(sizepref) <- sizepref[1,] sizepref <- as.data.frame(sizepref[-1,]) # dimensions for singletons file singletons <- data.frame( debug=3, Nyrs=NOBAom_ms$runpar$nyears, #may want to take from survey time series instead Nspecies=length(sppsubset$Name), Nsizebins=5, Nareas=1, Nfleets=length(unique(NOBAom_ms$truecatchage_ss$fleet)), Nsurveys=2, wtconv=1, yr1Nphase=0, recphase=0, avg_rec_phase=0, avg_F_phase=0, dev_rec_phase=0, dev_F_phase=0, fqphase=0, sqphase=0, ssig_phase=0, csig_phase=0, phimax=1, scaleInitialN=1, otherFood=0000, LFI_size=60, bandwidth_metric=5, assessmentPeriod=3, flagMSE=0, flagLinearRamp=1, baseline_threshold=0.2 ) # M1 same for all species, depends on n length bins # to get annual M1 of 0.1, this matrix needs to be M1 per timestep # right now input works for 5 timesteps per year, but should be # done in the code with input annual M1 in case timestep changes M1mat <- matrix(0.02, singletons$Nspecies, singletons$Nsizebins, dimnames = list(c(sort.default(sppsubset$Name)), c(paste0("sizeclass",seq(1:singletons$Nsizebins))))) M1 <- as.data.frame(M1mat) # sexratio fixed 50:50 for all species sexratio <- matrix(0.5, 1, singletons$Nspecies, dimnames = list("ratio", c(sort.default(sppsubset$Name)))) # placeholders for unused covariates dimenstioned to the number of simulated years dummycovar <- data.frame(dummy1 = rep(1, NOBAom_ms$runpar$nyears)) # placeholder for effort data dimensioned by year and nfleets allfleets <- atlantisom::load_fisheries(d.name, fisheries.file) fleets <- unique(NOBAom_ms$truecatchage_ss$fleet) dummyeffort <- data.frame(1:singletons$Nyrs, matrix(1, singletons$Nyrs, singletons$Nfleets)) names(dummyeffort) <- c("year", fleets) # will need to fit von B and maturity curve functions to back out hydra input pars # maturity from maturity ogive input in biol file # growth functions from length at age in model... 
```r
# write the csvs
o.name <- here("simulated-data/msinput/hydra/")

write_csv(foodweb, paste0(o.name,"foodweb_NOBA.csv"))
write_csv(specieslist, paste0(o.name,"speciesList_NOBA.csv"))
write_csv(lenwtpar, paste0(o.name,"lengthweight_species_NOBA.csv"))
write.csv(intakespp, paste0(o.name,"intake_species_NOBA.csv"))
write.csv(sizepref, paste0(o.name,"mortality_M2sizePreference_NOBA.csv"))
write.table(t(singletons), paste0(o.name,"singletons_NOBA.csv"), sep=",", col.names=FALSE)
write.csv(M1, paste0(o.name,"mortality_M1_NOBA.csv"))
write_csv(dummycovar, paste0(o.name,"recruitment_covariates_NOBA.csv"))
write_csv(dummycovar, paste0(o.name,"maturity_covariates_NOBA.csv"))
write_csv(dummycovar, paste0(o.name,"growth_covariates_NOBA.csv"))
write_csv(dummyeffort, paste0(o.name,"observation_effort_NOBA.csv"))
write.csv(sexratio, paste0(o.name,"sexratio_NOBA.csv"))
```
Decisions on size bins for hydra: keep 5 bins per species for now, build in flexibility to allocate size data to any number of bins with any definitions.
Species specific bin width definitions are an input into the model, with more refined (smaller) bin sizes during rapid growth phases.
Initially, we'll directly transfer bin widths for species common to GB and NOBA:
We will substitute bin widths from these similar taxa/size GB species for NOBA:
We will modify or substitute bin widths for the rest of the NOBA species based on size:
Code to make sizebins csv for input; hardcoded based on these decisions.
```r
# for reference, GB hydra_sim size bins
# sizeclass1,sizeclass2,sizeclass3,sizeclass4,sizeclass5,commentField-notused
# 20,20,20,20,30,spinydog
# 20,20,20,20,40,winterskate
# 5,5,5,5,20,Aherring
# 20,20,20,40,50,Acod
# 10,10,20,20,20,haddock
# 10,10,10,10,20,yellowtailfl
# 10,10,10,10,10,winterfl
# 10,10,10,10,10,Amackerel
# 10,10,10,10,30,silverhake
# 20,20,20,30,40,goosefish

sizebinmat <- matrix(0, singletons$Nspecies, singletons$Nsizebins,
                     dimnames = list(c(sort.default(sppsubset$Name)),
                                     c(paste0("sizeclass",seq(1:singletons$Nsizebins)))))

sizebinmat["Blue_whiting",] <- c(10,10,10,10,30)
sizebinmat["Capelin",] <- c(5,5,5,5,5)
sizebinmat["Green_halibut",] <- c(20,20,20,40,50)
sizebinmat["Haddock",] <- c(10,10,20,20,20)
sizebinmat["Long_rough_dab",] <- c(10,10,10,10,20)
sizebinmat["Mackerel",] <- c(10,10,10,10,10)
sizebinmat["North_atl_cod",] <- c(20,20,20,40,50)
sizebinmat["Norwegian_ssh",] <- c(5,5,5,5,20)
sizebinmat["Polar_cod",] <- c(5,5,5,5,5)
sizebinmat["Redfish",] <- c(10,10,10,10,10)
sizebinmat["Saithe",] <- c(20,20,20,30,40)

sizebins <- as.data.frame(sizebinmat) %>%
  mutate('commentField-notused' = rownames(sizebinmat))

write_csv(sizebins, paste0(o.name,"length_sizebins_NOBA.csv"))
```
These decisions on matching species can carry through to other life history parameters, at least initially.
Need to fit growth and maturity parameters from Atlantis output for input into hydra. Fit growth using the code below, based on the package FSA [@ogle-fsa-2021]. The package provides multiple ways to get starting values for the R `nls` function for fitting nonlinear models. It was necessary to try several different methods for starting values to get at least one von Bertalanffy growth parameter set for each species. A scratch code block shows some of the experiments to get the fitting working that resulted in the function to fit for all surveys.
Note that spring surveys don't get mackerel at all (not in the model domain then). A full set of von B parameters could only be estimated for the first era of both versions of the fall survey tested here.
sp_age <- NOBAom_ms$funct.group_ss[, c("Name", "NumCohorts", "NumAgeClassSize")] sp_age <- sp_age %>% mutate(n_annages = NumCohorts * NumAgeClassSize) stepperyr <- if(NOBAom_ms$runpar$outputstepunit=="days") 365/NOBAom_ms$runpar$toutinc # need growth, vonB t0, K, and Linf and simple growth function # would you really base this on 140 years of data? I used all survey years len_comp_data <- read_savedsurvs(d.name, 'survLen') # need to shift agecl to represent midpoint of actual age groups # using part of the wtage interpolation function calc_avgwtstage2age() annage_comp_data <- read_savedsurvs(d.name, 'survAnnAge') vb1 <- vbFuns() # VBGF packages all take age and length pairs, reformat from length comp at agecl # break into 3 chunks of time, "era" 48 years each more like our survey # subsample after reformatting with uncount because we wouldn't have this many ages #multiple surveys named in list object, use as filename for(s in names(len_comp_data)){ annages <- annage_comp_data[[s]][[1]] %>% group_by(species, time, agecl) %>% rename(trueage = agecl) %>% summarise(truenatage = sum(atoutput)) # Figure out the groups that have multiple ages classes in each stage (or # cohort) multiple_ages <- sp_age[sp_age$NumAgeClassSize>1,] # finds the average age of each agecl each timestep based on truenatage svlen_avgage <- annages %>% dplyr::semi_join(len_comp_data[[s]][[1]], by = c("species", "time")) %>% dplyr::inner_join(multiple_ages, by = c("species" = "Name")) %>% dplyr::mutate(agecl = as.integer(ceiling(trueage/NumAgeClassSize))) %>% dplyr::group_by(species, time, agecl) %>% dplyr::mutate(avgage = weighted.mean(trueage, truenatage)) #arrange into long format: year, Species, agecl, length svlenage <- len_comp_data[[s]][[1]] %>% dplyr::left_join(svlen_avgage) %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, agecl, avgage, year, upper.bins, atoutput) %>% dplyr::mutate(avgage = coalesce(avgage,agecl)) %>% dplyr::rename(length = upper.bins) %>% dplyr::mutate(era = as.numeric(cut_number(year,3))) %>% tidyr::uncount(atoutput) %>% dplyr::group_by(species, avgage, year) %>% dplyr::sample_frac(0.01) saveRDS(svlenage, file.path(d.name, paste0(scenario.name, "_", s, "agelensamp.rds"))) # } # # # read saved sample from above # survs <- list.files(path=d.name, pattern = "agelensamp.rds", full.names = TRUE) # # survname <- str_match(survs, paste0(scenario.name,"_\\s*(.*?)\\s","*agelensamp.rds"))[,2] # # agelensamp_data <- lapply(survs, readRDS) # # names(agelensamp_data) <- survname # # # for(s in names(agelensamp_data)){ # try different start points to see if we can get more species fit # this one goes to 11! 
# need because group_map doesn't keep group names gk <- svlenage %>% group_by(species, era) %>% group_keys() svvbgf3.2 <- svlenage %>% dplyr::group_by(species, era) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .,methLinf="longFish")))) #this gets exponential growth pars, probably won't use this growth mod though svexp3 <- svlenage %>% dplyr::group_by(species, era) %>% group_modify(~ broom::tidy(lm(log(length) ~ log(avgage), data = .x))) %>% select(species, era, term, estimate) %>% pivot_wider(names_from = "term", values_from = "estimate") %>% mutate(psi = exp(`(Intercept)`)) %>% rename(kappa = `log(avgage)`) %>% select(species, era, kappa, psi) vbout.2 <- data.frame() for(i in 1:length(svvbgf3.2)) { if(is.list(svvbgf3.2[[i]])) { spe <- cbind(as.data.frame(gk)[i,], as.data.frame(t(coef(svvbgf3.2[[i]])))) vbout.2 <- rbind(vbout.2, spe) } } growthpars <- left_join(svexp3, vbout.2) %>% mutate(growthType = case_when( is.na(Linf) ~ 2, TRUE ~ 4)) # vonB default, exponential if no vonB # sadly doesn't work so just make a csv for each era and use the one with most spp # maxera <- growthpars %>% # filter(!is.na(Linf)) %>% # group_by(era) %>% # tally() %>% # filter(n==1) %>% # select(era) growthspp <- growthpars %>% select(species, psi, kappa, Linf, K, growthType) %>% pivot_longer(psi:growthType, names_to = "par") %>% pivot_wider(names_from = species, values_from = value) growthspp.era <- split(growthspp, list(growthspp$era)) for(era in names(growthspp.era)) { gpar <- growthspp.era[[era]] %>% ungroup()%>% select(-era) %>% remove_rownames() %>% column_to_rownames(var="par") %>% write.csv(paste0(o.name,"growth_species_NOBA_era",era,"_",s,".csv")) } }
This code block has experiments with different fits and different ways to split the length at age data.
capelinvbgf <- svlenage %>% filter(species == "Capelin") %>% group_by(era) %>% group_map(try(~nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .)))) capelinall <- svlenage %>% filter(species == "Capelin") plot(length~avgage,data=capelinall,pch=19,col=rgb(0,0,0,1/5)) curve(vb1(x,Linf=coef(capelinvbgf[[1]])),from=0,to=5,col="red",lwd=2,add=TRUE) curve(vb1(x,Linf=coef(capelinvbgf[[2]])),from=0,to=5,col="blue",lwd=2,add=TRUE) curve(vb1(x,Linf=coef(capelinvbgf[[3]])),from=0,to=5,col="green",lwd=2,add=TRUE) # herringvbgf <- svlenage %>% # filter(species == "Norweigan_ssh", era==1) %>% # group_map(try(~nls(length~vb1(avgage, Linf, K, t0), data = ., # start = vbStarts(length~avgage,data = .)))) #Error in nls(length ~ vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length ~ : #singular gradient with group_map, era 2 and 3 codvbgf <- svlenage %>% filter(species == "North_atl_cod") #group_by(era) %>% codfit <- nls(length~vb1(avgage, Linf, K, t0), data = codvbgf, start = vbStarts(length~avgage,data = codvbgf)) plot(length~avgage,data=codvbgf,pch=19,col=rgb(0,0,0,1/5)) curve(vb1(x,Linf=coef(codfit)),from=0,to=20,col="red",lwd=2,add=TRUE) # era==1 Nonlinear regression model # model: length ~ vb1(avgage, Linf, K, t0) # data: codvbgf # Linf K t0 # 104.7120 0.4260 0.2176 # residual sum-of-squares: 4109723 # # Number of iterations to convergence: 7 # Achieved convergence tolerance: 2.169e-06 # all years Nonlinear regression model # model: length ~ vb1(avgage, Linf, K, t0) # data: codvbgf # Linf K t0 # 112.4694 0.3912 0.2097 # residual sum-of-squares: 7354906 # # Number of iterations to convergence: 11 # Achieved convergence tolerance: 9.964e-06 hadvbgf <- svlenage %>% filter(species == "Haddock") #group_by(era) %>% hadfit <- nls(length~vb1(avgage, Linf, K, t0), data = hadvbgf, start = vbStarts(length~avgage,data = hadvbgf)) plot(length~avgage,data=hadvbgf,pch=19,col=rgb(0,0,0,1/5)) curve(vb1(x,Linf=coef(hadfit)),from=0,to=20,col="red",lwd=2,add=TRUE) # all years # Nonlinear regression model # model: length ~ vb1(avgage, Linf, K, t0) # data: hadvbgf # Linf K t0 # 77.0277 0.3601 -0.3546 # residual sum-of-squares: 15637500 # # Number of iterations to convergence: 11 # Achieved convergence tolerance: 4.187e-06 redvbgf <- svlenage %>% filter(species == "Redfish") #group_by(era) %>% redfit <- nls(length~vb1(avgage, Linf, K, t0), data = redvbgf, start = vbStarts(length~avgage,data = redvbgf)) plot(length~avgage,data=redvbgf,pch=19,col=rgb(0,0,0,1/5)) curve(vb1(x,Linf=coef(redfit)),from=0,to=40,col="red",lwd=2,add=TRUE) #all years # Nonlinear regression model # model: length ~ vb1(avgage, Linf, K, t0) # data: redvbgf # Linf K t0 # 40.8016 0.1165 -1.6688 # residual sum-of-squares: 3197990 # # Number of iterations to convergence: 13 # Achieved convergence tolerance: 8.316e-06 saithe1 <- svlenage %>% filter(species=="Saithe", era==1) saithe1fit <- nls(length~vb1(avgage, Linf, K, t0), data = saithe1, start = vbStarts(length~avgage,data = saithe1)) plot(length~avgage,data=saithe1,pch=19,col=rgb(0,0,0,1/5)) curve(vb1(x,Linf=coef(saithe1fit)),from=0,to=20,col="red",lwd=2,add=TRUE) #Nonlinear regression model # model: length ~ vb1(avgage, Linf, K, t0) # data: saithe1 # Linf K t0 # 388.18797 0.01975 -3.94798 # residual sum-of-squares: 5691213 # # Number of iterations to convergence: 5 # Achieved convergence tolerance: 5.511e-06 #plot len-age data and start values, some are bad svlenage_pl <- svlenage %>% group_by(species, era) %>% 
group_map(~vbStarts(length~avgage,data=cur_group(),plot=TRUE)) ### full time series fits with different start points svvbgf <- svlenage %>% dplyr::group_by(species) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .)))) spk <- svlenage %>% group_by(species) %>% group_keys() vballout <- data.frame() for(i in 1:length(svvbgf)) { if(is.list(svvbgf[[i]])) { sp <- cbind(as.data.frame(spk)[i,], t(coef(svvbgf[[i]]))) vballout <- rbind(vballout, sp) } } svvbgf.1 <- svlenage %>% dplyr::group_by(species) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .,methLinf="oldAge")))) vballout.1 <- data.frame() for(i in 1:length(svvbgf.1)) { if(is.list(svvbgf.1[[i]])) { sp <- cbind(as.data.frame(spk)[i,], t(coef(svvbgf.1[[i]]))) vballout.1 <- rbind(vballout.1, sp) } } svvbgf.2 <- svlenage %>% dplyr::group_by(species) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .,methLinf="longFish")))) vballout.2 <- data.frame() for(i in 1:length(svvbgf.2)) { if(is.list(svvbgf.2[[i]])) { sp <- cbind(as.data.frame(spk)[i,], t(coef(svvbgf.2[[i]]))) vballout.2 <- rbind(vballout.2, sp) } } ################## by era gk <- svlenage %>% group_by(species, era) %>% group_keys() svvbgf3 <- svlenage %>% dplyr::group_by(species, era) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .)))) vbout <- data.frame() for(i in 1:length(svvbgf3)) { if(is.list(svvbgf3[[i]])) { spe <- cbind(as.data.frame(gk)[i,], t(coef(svvbgf3[[i]]))) vbout <- rbind(vbout, spe) } } # try different start points to see if we can get more species fit svvbgf3.1 <- svlenage %>% dplyr::group_by(species, era) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .,methLinf="oldAge")))) vbout.1 <- data.frame() for(i in 1:length(svvbgf3.1)) { if(is.list(svvbgf3.1[[i]])) { spe <- cbind(as.data.frame(gk)[i,], t(coef(svvbgf3.1[[i]]))) vbout.1 <- rbind(vbout.1, spe) } } # try different start points to see if we can get more species fit # this one goes to 11! svvbgf3.2 <- svlenage %>% dplyr::group_by(species, era) %>% group_map(~try(nls(length~vb1(avgage, Linf, K, t0), data = ., start = vbStarts(length~avgage,data = .,methLinf="longFish")))) vbout.2 <- data.frame() for(i in 1:length(svvbgf3.2)) { if(is.list(svvbgf3.2[[i]])) { spe <- cbind(as.data.frame(gk)[i,], t(coef(svvbgf3.2[[i]]))) vbout.2 <- rbind(vbout.2, spe) } }
Generating length composition input data requires mapping numbers at length from the atlantisom 1 cm bin length output to our selected length bins, then summing the numbers at length in those bins for all species. The input required is a 4d array, indexed by survey, species, year, and sizebin. Each survey can therefore be a separate csv: for each species, year rows with sizebin columns holding the numbers at length, stacked one species after another (see the illustration below).
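For illustration only (species, years, and counts are made up, and the column names follow the sizebin definitions written above), one survey's length composition csv is laid out roughly like this:

```
species,year,sizeclass1,sizeclass2,sizeclass3,sizeclass4,sizeclass5
Capelin,1,1200,85000,43000,NA,NA
Capelin,2,1500,90000,41000,NA,NA
...
North_atl_cod,1,300,5200,20000,12000,800
...
```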
Arrange lengths into appropriate bins:
```r
# rearrange into length bins and write length comp data csvs
# modbins from sizebins above, differ by species so need modbin.min and modbin.max
# pcumsum is an FSA function
modbins <- sizebins %>%
  rename(species = `commentField-notused`) %>%
  pivot_longer(!species, names_to = "sizebin", values_to = "binwidth") %>%
  group_by(species) %>%
  arrange(sizebin) %>%
  mutate(modbin.min = pcumsum(binwidth),
         modbin.max = cumsum(binwidth)) %>%
  arrange(species, sizebin)

#multiple surveys named in list object, use as filename
for(s in names(len_comp_data)){
  #arrange into long format: year, Species, agecl, length
  svlenbin <- len_comp_data[[s]][[1]] %>%
    dplyr::mutate(year = ceiling(time/stepperyr)) %>%
    dplyr::select(species, agecl, year, upper.bins, atoutput) %>%
    dplyr::left_join(modbins) %>%
    dplyr::filter(modbin.min < upper.bins & upper.bins <= modbin.max) %>%
    dplyr::group_by(species, year, sizebin) %>%
    dplyr::summarise(sumlen = sum(atoutput)) %>%
    tidyr::spread(sizebin, sumlen) %>%
    write_csv(paste0(o.name,"observation_lengths_NOBA_",s,".csv"))
}
```
The first pass through with the current bins results in very few lengths in the smallest bin for most species. This suggests we need to change the binwidths and possibly the number of bins to optimize length fitting. Perhaps a larger first bin or don't start it at 0?
Consider a function to do this using species max size, the vonB k and the desired number of bins for more consistency and to easily change Nsizebins.
This function is below, but with only 5 bins I don't like how it works: it allocates more of the smaller bins to the smallest sizes, which we don't observe, and aggregates many lengths into one large bin. Therefore the simplest approach is to have equal-sized bins (max observed length divided by the desired number of bins). We will try this for the length comps.
# see e.g., https://www.sciencedirect.com/science/article/abs/pii/S0165783619303297 # estimated optimal bin size for VBGF estimation is 0.23 x Lmax ^0.6 # see Ogle citing Schnute and Fornier on K as a growth increment # https://derekogle.com/NCNRS349/modules/Growth/BKG # K exponential rate of approach to asymptotic size # exp(-K) fraction by which growth increment is multiplied each year # at what length is increment*exp(-K) small? group those lengths into bigger bins # pull observed min and max size from data to optimize start and end bins minmaxlen <- list() # read in saved vonB pars--see if seasons are different and will need a replacement if K wasn't estimated #era <- 1 vonBK <- list() minmaxvonBK <- list() for(s in names(len_comp_data)){ minmaxlendf <- len_comp_data[[s]][[1]] %>% dplyr::group_by(species) %>% dplyr::summarise(minlen = min(lower.bins, na.rm = TRUE), maxlen = max(upper.bins, na.rm = TRUE), range = maxlen-minlen) for(era in 1:3){ vonBKer <- read.csv(paste0(o.name,"growth_species_NOBA_era",era,"_",s,".csv")) %>% rename(par = X) %>% pivot_longer(!par, names_to = "species") %>% pivot_wider(names_from = par, names_prefix = era, values_from = value) %>% select(species, paste0(era,"K")) ifelse(era==1, vonBKdf <- vonBKer, vonBKdf <- left_join(vonBKdf, vonBKer)) } minmaxvonBKdf <- left_join(minmaxlendf,vonBKdf) minmaxlen[[s]] <- minmaxlendf vonBK[[s]] <- vonBKdf minmaxvonBK[[s]] <- minmaxvonBKdf } # note! # Green_halibut and Saithe have maxlen at 200 in at least one survey, # may need higher maxlength when running atlantisom comps wrapper, fix survey config files estbinwidths <- function(Lmax, vonBk, Nsizebins, thresh = 0.05){ #optimal bin size is lower end binopt <- 0.23*Lmax^0.6 #range gives initial default binwidth, all equal (range/Nbins) #or should it be max/nbins? #can use if no K estimated bindef <- max(1,ceiling(Lmax/Nsizebins)) #minmaxvonBK$BTS_fall_allbox_effic1 %>% group_by(species) %>% mutate(bindef = ceiling(maxlen/5)) #vonBk tells you how many fine bins for fast growth #low vonBK is doesn't asympote, divide length range equally into nbins? #high vonBK asympotes quickly, # threshold cumprod of 0.05 or 0.1 to define last bin? tiny growth increment # use binopt up to this threshold? # plot(cumprod(rep(1, 160)*exp(-0.0386)), ylim=c(0,1), xlim=c(0,180)) #saithe # lines(cumprod(rep(1, 20)*exp(-1.4)), ylim=c(0,1), xlim=c(0,180)) #capelin # lines(cumprod(rep(1, 40)*exp(-0.163)), ylim=c(0,1), xlim=c(0,180)) #redfish # lines(cumprod(rep(1, 130)*exp(-0.344)), ylim=c(0,1), xlim=c(0,180)) #cod lenthresh <- max(which(cumprod(rep(1, Lmax)*exp(-vonBk)) > thresh)) #gives 1 cm increments of fast growth #put all lengths above this into one big bin, binwidth.max <- Lmax-lenthresh #divide lengths below this by thresholds? for min 5 bins, use .5, .3, .2, .1, .05? # or divide equally into nbins-1 bins? 
equal may be best # hightest growth bin will be very small and we won't have fish in it # dont go smaller than binopt binwidth.gro <- max(ceiling(binopt), ceiling(lenthresh/(Nsizebins-1))) #aggregate bins in multiples of optimal for bigger fish/slower growth to get Nsizebins ifelse(binwidth.max > bindef & binwidth.max < 0.5*Lmax, binwidths <- c(rep(binwidth.gro, Nsizebins-1), binwidth.max), binwidths <- rep(bindef, Nsizebins)) return(binwidths) } equalbinwidths <- function(Lmax, Nsizebins){ bindef <- max(1,ceiling(Lmax/Nsizebins)) binwidths <- rep(bindef, Nsizebins) return(binwidths) } # for fixed bins across surveys find min and max observed lengths maxLrange <- bind_rows(minmaxvonBK) %>% group_by(species) %>% summarise(minminL = min(minlen), maxmaxL = max(maxlen)) Nsizebins <- 5 newbins <- maxLrange %>% group_by(species) %>% group_modify(~broom::tidy(equalbinwidths(.$maxmaxL, Nsizebins))) %>% mutate(lb = paste0("sizebin",seq(1:Nsizebins))) %>% pivot_wider(names_from = lb, values_from = x) sizebins <- as.data.frame(newbins) %>% rename('commentField-notused' = species) %>% relocate('commentField-notused', .after = last_col()) write_csv(sizebins, paste0(o.name,"length_sizebins_NOBA.csv"))
Reran the lendat code block above to make the survey-specific length input files with the new equal-sized bins.
Now write input files in Gavin's format to have surveys together in one file. Note that this is just a test: the folder has two different survey and two different fishery configurations, and they are not independent, so we wouldn't use all of these in an input file. Also, fishery catch is not currently separated into fleets. Fishery length comps are added here as well:
modbins <- sizebins %>% rename(species = `commentField-notused`) %>% pivot_longer(!species, names_to = "sizebin", values_to = "binwidth") %>% group_by(species) %>% arrange(sizebin) %>% mutate(modbin.min = pcumsum(binwidth), modbin.max = cumsum(binwidth)) %>% arrange(species, sizebin) allsvlenbin <- tibble() #multiple surveys named in list object, add to file for(s in names(len_comp_data)){ #arrange into long format: year, Species, agecl, length svlenbin <- len_comp_data[[s]][[1]] %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, agecl, year, upper.bins, atoutput) %>% dplyr::left_join(modbins) %>% dplyr::filter(modbin.min < upper.bins & upper.bins <= modbin.max) %>% dplyr::group_by(species, year, sizebin) %>% dplyr::summarise(sumlen = sum(atoutput)) %>% tidyr::spread(sizebin, sumlen) %>% ungroup() %>% dplyr::mutate(inpN = rowSums(.[,-(1:2)], na.rm = TRUE)) %>% dplyr::mutate(type = 0) %>% dplyr::mutate(survey = s) %>% dplyr::mutate_at(vars(contains("sizebin")), ~./inpN) %>% dplyr::select(survey, year, species, type, inpN, everything()) allsvlenbin <- bind_rows(allsvlenbin, svlenbin) } write_csv(allsvlenbin, paste0(o.name,"observation_lengths_NOBA_allsurvs.csv")) #requires read_savedfisheries function below, add to atlantisom catchlen_ss <- read_savedfisheries(d.name, "catchLen") allfishlenbin <- tibble() for(f in names(catchlen_ss)){ #arrange into long format: year, Species, agecl, length fishlenbin <- catchlen_ss[[f]][[1]] %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, agecl, year, upper.bins, atoutput) %>% dplyr::left_join(modbins) %>% dplyr::filter(modbin.min < upper.bins & upper.bins <= modbin.max) %>% dplyr::group_by(species, year, sizebin) %>% dplyr::summarise(sumlen = sum(atoutput)) %>% tidyr::spread(sizebin, sumlen) %>% ungroup() %>% dplyr::mutate(inpN = rowSums(.[,-(1:2)], na.rm = TRUE)) %>% dplyr::mutate(type = 0) %>% dplyr::mutate(fishery = f, area = 1) %>% dplyr::mutate_at(vars(contains("sizebin")), ~./inpN) %>% dplyr::select(fishery, area, year, species, type, inpN, everything()) %>% dplyr::arrange(fishery, area, species, year) allfishlenbin <- bind_rows(allfishlenbin, fishlenbin) } write_csv(allfishlenbin, paste0(o.name,"observation_lengths_NOBA_allfisheries.csv"))
Stomach weights are needed as input to hydra. At present each species has a stomach weight for each size bin that is repeated for each year. We have total consumed weight for each predator age class at each timestep in the detailed diet Atlantis output, so we would need to map age classes to length bins, sum consumed weight by length bin, and divide by total numbers in that length bin to get stomach weight.
What we have saved so far is a diet comp in proportion, with bias and observation error added to the proportions. However, we need the stomach weight portion of the equation, which has the survey timing/area/efficiency/selectivity applied to true consumed weight. The hydra input is mean daily stomach weight in grams. From the Atlantis manual, "DetailedDietCheck.txt returns the total consumed biomass since the last output given for each cell (box and layer) of each age group of each functional group." So daily per capita consumed biomass is this output, summed over prey for total consumption, divided by the numbers from the same survey design, divided by the output step `omlist_ss$runpar$outputstep`.
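As an equation (following the description above; consumed biomass in tons is converted to grams), the mean daily per capita stomach content weight for predator species $s$ and age class $a$ at time $t$ is

$$\bar{S}_{s,a,t} \;=\; \frac{\sum_{p} C_{s,a,p,t}}{N_{s,a,t}\; \Delta t_{out}} \times 10^{6},$$

where $C_{s,a,p,t}$ is survey-sampled consumed biomass of prey $p$, $N_{s,a,t}$ is survey-sampled numbers, and $\Delta t_{out}$ is `omlist_ss$runpar$outputstep` in days. Note that the code below also tries treating the consumption output as a snapshot rather than a cumulative value, in which case the $\Delta t_{out}$ division is dropped.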
# apply surveys to true total consumption (snipped from atlantosom om_diet.R and om_comps.R) detaileddiet <- readRDS(file.path(d.name, paste0(scenario.name, "detaileddiet.rds"))) usersurvey <- c(here("simulated-data/config/mssurvey_spring.R"), here("simulated-data/config/mssurvey_fall.R"), here("simulated-data/config/mssurvey_spring_01.R"), here("simulated-data/config/mssurvey_fall_01.R")) omlist_ss <- readRDS(file.path(d.name, paste0(scenario.name, "omlist_ss.rds"))) source(here("simulated-data/config/omdimensions.R")) survObsStomWt <- list() for (s in usersurvey) { source(s, local = TRUE) # survtime doesn't match units of time.days in detaileddiet survtime <- survey_sample_full*omlist_ss$runpar$outputstep # apply survey design to detailed diet survey_cons <- create_survey_diet(dat = detaileddiet, time = survtime, species = survspp, boxes = survboxes, effic = surveffic, selex = survselex) # get numbers at ageclass for same survey design # note different time units! survey_N <- atlantisom::create_survey(dat = omlist_ss$truenums_ss, time = survey_sample_full, species = survspp, boxes = survboxes, effic = surveffic, selex = survselex.agecl) # convert survey N times to cons times in days survey_N$time <- survey_N$time*omlist_ss$runpar$outputstep # get rid of polygon survey_totN <- survey_N %>% group_by(species, agecl, time) %>% summarise(totN = sum(atoutput)) %>% ungroup() # sum over prey to get total consumption in t, divide by N and timestep, convert to g survey_totcons <- survey_cons %>% group_by(species, agecl, time) %>% summarise(totcons = sum(atoutput)) %>% ungroup() %>% left_join(survey_totN) %>% mutate(percap_cons = totcons/totN) %>% #mutate(daily_percap_g = percap_cons/omlist_ss$runpar$outputstep*1000000) mutate(daily_percap_g = percap_cons*1000000) #try assuming cons is snapshot not cumulative since last timestep #save survey consumption, takes a long time to generate with lots of reps/species #if(save){ saveRDS(survey_cons, file.path(d.name, paste0(scenario.name, "_", survey.name, "surveycons.rds"))) #} survObsStomWt[[survey.name]] <- survey_cons } # map to length bins using surey numbers at agecl in length bins # #multiple surveys named in list object, use as filename # for(s in names(all_diets)){ # #arrange into wide format: year, Species1, Species2 ... and write csv # svdiet <- all_diets[[s]][[1]] %>% # #calc_timestep2time(d.name, ., run.prm.file) %>% #this gives an arbitrary start year of 1970 # dplyr::mutate(time = time.days/NOBAom_ms$runpar$toutinc) %>% # dplyr::left_join(., trueBagecl) %>% # dplyr::mutate(dietBwt = dietSamp*trueBatage) %>% # dplyr::group_by(species,time,prey,trueB) %>% # dplyr::summarise(totdietBwt = sum(dietBwt)) %>% # dplyr::mutate(totdietComp = totdietBwt/trueB) %>% # dplyr::mutate(year = ceiling(time/stepperyr)) %>% # dplyr::ungroup() %>% # dplyr::select(year, species, prey, totdietComp) %>% # write_csv(paste0(o.name,"diet_",s,".csv")) # }
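The mapping from predator age class to size bin flagged in the comments above is not completed in this chunk. As a minimal sketch of one possible approach (not the final method), assuming the `svagelenbin` age-class-to-size-bin proportions built for the diet data in the next chunk and the `survey_totcons` table with its `daily_percap_g` column from above:

```r
# Sketch only: average per capita daily stomach content weight (g) by size bin,
# weighting each age class by its contribution (propage) to the size bin.
stomwt_sizebin <- survey_totcons %>%
  dplyr::mutate(year = ceiling(time/365)) %>%                 # time here is in days
  dplyr::left_join(svagelenbin, by = c("species", "agecl", "year")) %>%
  dplyr::filter(!is.na(sizebin)) %>%
  dplyr::group_by(species, year, sizebin) %>%
  dplyr::summarise(meanstomwt_g = weighted.mean(daily_percap_g, propage, na.rm = TRUE))
```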
We also plan to fit to time series of prey proportions for each predator size class. Gavin suggests an input format similar to those for survey lengths. Diet output from surveys is by predator `agecl`, so we map predator age classes to size bins using the survey length composition at age class, as in the code below.
# same file as for MSSPM, diet proportion by agecl all_diets <- read_savedsurvs(d.name, 'survDiet') # get size bins for predators sizebins <- read_csv(paste0(o.name,"length_sizebins_NOBA.csv")) modbins <- sizebins %>% rename(species = `commentField-notused`) %>% pivot_longer(!species, names_to = "sizebin", values_to = "binwidth") %>% group_by(species) %>% arrange(sizebin) %>% mutate(modbin.min = pcumsum(binwidth), modbin.max = cumsum(binwidth)) %>% arrange(species, sizebin) # get Dirichlet alphamults from config files svcon <- list.files(path=here("/simulated-data/config/"), pattern = "*survey*", full.names = TRUE) svalphamultlook <- tibble() for(c in 1:length(svcon)){ source(svcon[c]) svalp <- tibble( surv_alphamult_n = alphamult, survey = survey.name ) svalphamultlook <- bind_rows(svalphamultlook, svalp) } # diet is by predator agecl, need diet by predator sizebin # map predator agecl to lengthbin using length_comp_data from same survey allsvdietprop <- tibble() #multiple surveys named in list object for(s in names(all_diets)){ #arrange into lookup for agecl to lengthbin in numbers of fish svagelenbin <- len_comp_data[[s]][[1]] %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, agecl, year, upper.bins, atoutput) %>% dplyr::left_join(modbins) %>% dplyr::filter(modbin.min < upper.bins & upper.bins <= modbin.max) %>% dplyr::group_by(species, year, agecl, sizebin) %>% dplyr::summarise(sumlen = sum(atoutput)) %>% dplyr::group_by(species, year, sizebin) %>% dplyr::mutate(propage = sumlen/sum(sumlen)) #proportion of each agecl contributing to lengthbin svdietprop <- all_diets[[s]][[1]] %>% dplyr::mutate(year = ceiling(time.days/365)) %>% # join with agecl to length lookup dplyr::select(species, agecl, year, prey, dietSamp) %>% dplyr::left_join(svagelenbin) %>% dplyr::mutate(dietpropage = dietSamp*propage) %>% #reweight diets for lengthbins dplyr::group_by(species, year, sizebin, prey) %>% dplyr::summarise(dietsize = sum(dietpropage)) %>% dplyr::filter(prey %in% unique(modbins$species)) %>% #drops predators that don't eat our modeled species tidyr::spread(prey, dietsize) %>% ungroup() %>% dplyr::filter(!is.na(sizebin)) %>% dplyr::mutate(allotherprey = 1-rowSums(.[,-(1:3)], na.rm = TRUE)) %>% dplyr::mutate(survey = s) %>% dplyr::left_join(svalphamultlook) %>% #what is effective sample size for Dirichlet? dplyr::rename(inpN = surv_alphamult_n) %>% #use this input for now dplyr::select(survey, year, species, sizebin, inpN, everything()) allsvdietprop <- bind_rows(allsvdietprop, svdietprop) } write_csv(allsvdietprop, paste0(o.name,"observation_diets_NOBA_allsurvs.csv")) #
Do we need to fit recruitment? We may just need average recruitment starting values, which I can get from Atlantis. Maturity parameters are needed for SSB calculation and SSB-based recruitment. If we fit only to average recruitment with annual deviations, and don't need SSB, maturity and recruitment inputs could be dummy parameters for now.
```r
# need 2-par maturity curve based on length, atlantis maturity ogive based on age group
# or just translate age at 50% maturity to length and call it good?
```
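One simple option flagged above is to translate age at 50% maturity from the Atlantis maturity ogive into a length at 50% maturity via a fitted von Bertalanffy curve. A minimal sketch, assuming a made-up A50 and using the all-years cod growth fit reported in the scratch block above:

```r
# Hypothetical example: convert age at 50% maturity (A50) to length at 50% maturity
vb_len <- function(age, Linf, K, t0) Linf * (1 - exp(-K * (age - t0)))
A50 <- 4                                                      # placeholder age at 50% maturity
L50 <- vb_len(A50, Linf = 112.47, K = 0.3912, t0 = 0.2097)    # cod all-years fit from above
L50                                                           # length (cm) at 50% maturity
```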
Survey and catch datasets for hydra from NOBA (same methods as MSSPM). Make this a function that takes the output directory name and output names as arguments; one possible shape for such a helper is sketched below.
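A sketch of one possible helper (not implemented here; the unrefactored code below is what was actually run), taking the Atlantis output directory, the output directory, and a file name prefix as arguments, and using the `read_savedsurvs` reader and `stepperyr` as defined above:

```r
# Sketch: write wide-format survey biomass csvs for all saved surveys to an
# output directory, with a caller-supplied file name prefix.
write_survey_biomass <- function(d.name, o.name, prefix, stepperyr) {
  survObsBiom <- read_savedsurvs(d.name, 'survB')
  for (s in names(survObsBiom)) {
    survObsBiom[[s]][[1]] %>%
      dplyr::mutate(year = ceiling(time/stepperyr)) %>%
      dplyr::select(species, year, atoutput) %>%
      tidyr::spread(species, atoutput) %>%
      readr::write_csv(paste0(o.name, prefix, s, ".csv"))
  }
}
# e.g. write_survey_biomass(d.name, o.name, "observation_biomass_NOBA_", stepperyr)
```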
o.name <- here("simulated-data/msinput/hydra/") stepperyr <- if(NOBAom_ms$runpar$outputstepunit=="days") 365/NOBAom_ms$runpar$toutinc #read in survey biomass data survObsBiom <- read_savedsurvs(d.name, 'survB') #reads in all surveys #multiple surveys named in list object, use as filename for(s in names(survObsBiom)){ #arrange into wide format: year, Species1, Species2 ... and write csv svbio <- survObsBiom[[s]][[1]] %>% #calc_timestep2time(d.name, ., run.prm.file) %>% #this gives an arbitrary start year of 1970 dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, year, atoutput) %>% tidyr::spread(species, atoutput) %>% write_csv(paste0(o.name,"observation_biomass_NOBA_",s,".csv")) } #read in catch data read_savedfisheries <- function(dir, type){ datlook <- data.frame(dattype = c('Catch', 'catchAge', 'catchLen', 'catchWtage', 'catchAnnAge', 'catchAnnWtage'), pattern = c("*fishCatch.rds", "*fishObsAgeComp.rds", "*fishObsLenComp.rds", "*fishObsWtAtAge.rds", "*fishObsFullAgeComp.rds", "*fishObsFullWtAtAge.rds")) fisheries <- list.files(path=dir, pattern = as.character(datlook$pattern[datlook$dattype %in% type]), full.names = TRUE) fishery.name <- str_match(fisheries, paste0(scenario.name,"_\\s*(.*?)\\s",datlook$pattern[datlook$dattype==type]))[,2] dat <- lapply(fisheries, readRDS) names(dat) <- fishery.name return(dat) } catchbio_ss <- read_savedfisheries(d.name, 'Catch') for(c in names(catchbio_ss)){ #arrange into wide format: year, Species1, Species2 ... and write csv catchbio <- catchbio_ss[[c]][[1]] %>% dplyr::mutate(year = time/365) %>% dplyr::select(species, year, atoutput) %>% tidyr::spread(species, atoutput) %>% write_csv(paste0(o.name,"observation_catch_NOBA_",c,".csv")) } #write readme for metadata meta <- c("Simulated data from Norweigan-Barents Atlantis Model by Hansen et al. \n\n", paste("Atlantis outputs stored in", d.name, '\n\n'), paste("Biomass output for surveys", names(survObsBiom), '\n\n'), paste("Fishery catch output for fisheries", names(catchbio_ss), '\n\n'), "Biomass and catch output units are annual tons.\n\n", "Survey specifications:\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_fall.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_spring.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_fall_01.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/mssurvey_spring_01.R")), "\n\n", "Fishery specifications:\n", "----------------------\n", read_file(here("/simulated-data/config/msfishery.R")), "\n", "----------------------\n", read_file(here("/simulated-data/config/msfishery_01.R")), "\n" ) cat(as.character(meta), file=paste0(here("simulated-data/msinput/hydra/"),"README.txt"))
Survey and catch datasets in Gavin's suggested format (all surveys together, all fleets together). As above, note that this is just a test. The folder has two different survey and two different fishery configurations; they are not independent. Also, fishery catch is not currently separated into fleets. We wouldn't use all of these in an input file.
#read in survey biomass data survObsBiom <- read_savedsurvs(d.name, 'survB') #reads in all surveys # get cvs from config files svcon <- list.files(path=here("/simulated-data/config/"), pattern = "*survey*", full.names = TRUE) fishcon <- list.files(path=here("/simulated-data/config/"), pattern = "*fishery*", full.names = TRUE) omlist_ss <- NOBAom_ms source(here("/simulated-data/config/omdimensions.R")) svcvlook <- tibble() for(c in 1:length(svcon)){ source(svcon[c]) surv_cv_n <- surv_cv %>% mutate(survey=survey.name) svcvlook <- bind_rows(svcvlook, surv_cv_n) } fcvlook <- tibble() for(c in 1:length(fishcon)){ source(fishcon[c]) fish_cv_n <- fish_cv %>% mutate(fishery=fishery.name) fcvlook <- bind_rows(fcvlook, fish_cv_n) } allsvbio <- tibble() #multiple surveys named in list object, use as filename for(s in names(survObsBiom)){ #arrange into wide format: year, Species1, Species2 ... and write csv svbio <- survObsBiom[[s]][[1]] %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(species, year, atoutput) %>% dplyr::rename(biomass = atoutput) %>% dplyr::mutate(survey = s) %>% dplyr::left_join(svcvlook) %>% dplyr::select(survey, year, species, everything()) %>% dplyr::arrange(survey, species, year) allsvbio <- bind_rows(allsvbio, svbio) } write_csv(allsvbio, paste0(o.name,"observation_biomass_NOBA_allsurvs.csv")) allcatch <- tibble() for(f in names(catchbio_ss)){ catchbio <- catchbio_ss[[f]][[1]] %>% dplyr::mutate(year = time/365) %>% dplyr::select(species, year, atoutput) %>% dplyr::rename(catch = atoutput) %>% dplyr::mutate(fishery = f) %>% dplyr::mutate(area = 1) %>% dplyr::left_join(fcvlook) %>% dplyr::select(fishery, area, year, species, everything()) %>% dplyr::arrange(fishery, species, year) allcatch <- bind_rows(allcatch, catchbio) } write_csv(allcatch, paste0(o.name,"observation_catch_NOBA_allfisheries.csv"))
Output temperature data from surveys using the "Temp" variable in the .nc file and the `atlantisom::load_nc_physics` function. A survey (timestep, polygons, possibly observation error) can then be applied to this output.
We can get surface and bottom temp from Atlantis using the layers from each polygon, with layer 0 being the bottom and the maximum layer in the polygon being the surface (according to the Atlantis wiki/Atlantis output files/Atlantis NetCDF structure).
# taken from existing load_nc to explore # file.nc <- file.path(d.name, "outputnordic_runresults_01.nc") # # # Load ATLANTIS output! # at_out <- RNetCDF::open.nc(con = file.nc) # # # Get info from netcdf file! (Filestructure and all variable names) # var_names_ncdf <- sapply(seq_len(RNetCDF::file.inq.nc(at_out)$nvars - 1), # function(x) RNetCDF::var.inq.nc(at_out, x)$name) # n_timesteps <- RNetCDF::dim.inq.nc(at_out, 0)$length # n_boxes <- RNetCDF::dim.inq.nc(at_out, 1)$length # n_layers <- RNetCDF::dim.inq.nc(at_out, 2)$length # # RNetCDF::close.nc(at_out) #'@physic_variables A character value spefifying which variables #' should be loaded. Only one variable can be loaded at a time. #' Currently, only the following variables are allowed: #' "salt", "NO3", "NH3", "Temp", "Oxygen", "Si", "Det_Si", "DON", #' "Chl_a", "Denitrifiction", "Nitrification". truetemp <- atlantisom::load_nc_physics(dir = d.name, file_nc = "outputnordic_runresults_01.nc", physic_variables = "Temp", aggregate_layers = FALSE, bboxes = atlantisom::get_boundary(NOBAom_ms$boxpars)) #if(verbose) message("Temperature read in.") # this averages all depth layers, don't want this # truetempagg <- atlantisom::load_nc_physics(dir = d.name, # file_nc = "outputnordic_runresults_01.nc", # physic_variables = "Temp", # aggregate_layers = TRUE, # bboxes = atlantisom::get_boundary(NOBAom_ms$boxpars)) truebottomtemp <- truetemp %>% filter(layer == 0) # according to Atlantis wiki, layer 0 is bottom truesurfacetemp <- truetemp %>% filter(layer<7) %>% #layer 7 is always identical temp to layer 0, may be the added sediment layer? group_by(polygon, time) %>% filter(layer==max(layer)) %>% ungroup surveyenv <- function(trueenv, survboxes, survtimes) { surveydat <- trueenv %>% dplyr::filter(polygon %in% survboxes) %>% dplyr::filter(time %in% survtimes) return(surveydat) } omlist_ss <- NOBAom_ms source(here("simulated-data/config/omdimensions.R")) #need to weight by polygon area to get mean annual temp time series for each survey boxarea <- map_dfr(NOBAom_ms$boxpars$boxes, "area") %>% pivot_longer(everything(), names_to = "polygon", values_to = "area") %>% mutate(polygon = as.integer(polygon)) %>% mutate(proparea = area/sum(area)) source(here("simulated-data/config/mssurvey_fall.R")) fallcensus_bottomtemp <- surveyenv(truebottomtemp, survboxes = survboxes, survtimes = survtime) fallcensus_btempindex <- left_join(fallcensus_bottomtemp, boxarea) %>% dplyr::group_by(time) %>% dplyr::summarise(meantemp = weighted.mean(atoutput, proparea)) %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(year, meantemp) %>% write_csv(paste0(o.name,"observation_temperature_NOBA_",survey.name,".csv")) source(here("simulated-data/config/mssurvey_spring.R")) springcensus_bottomtemp <- surveyenv(truebottomtemp, survboxes = survboxes, survtimes = survtime) springcensus_btempindex <- left_join(springcensus_bottomtemp, boxarea) %>% dplyr::group_by(time) %>% dplyr::summarise(meantemp = weighted.mean(atoutput, proparea)) %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(year, meantemp) %>% write_csv(paste0(o.name,"observation_temperature_NOBA_",survey.name,".csv")) source(here("simulated-data/config/mssurvey_fall_01.R")) fallnearbox_bottomtemp <- surveyenv(truebottomtemp, survboxes = survboxes, survtimes = survtime) fallnearbox_btempindex <- left_join(fallnearbox_bottomtemp, boxarea) %>% dplyr::group_by(time) %>% dplyr::summarise(meantemp = weighted.mean(atoutput, proparea)) %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% 
dplyr::select(year, meantemp) %>% write_csv(paste0(o.name,"observation_temperature_NOBA_",survey.name,".csv")) source(here("simulated-data/config/mssurvey_spring_01.R")) springnearbox_bottomtemp <- surveyenv(truebottomtemp, survboxes = survboxes, survtimes = survtime) springnearbox_btempindex <- left_join(springnearbox_bottomtemp, boxarea) %>% dplyr::group_by(time) %>% dplyr::summarise(meantemp = weighted.mean(atoutput, proparea)) %>% dplyr::mutate(year = ceiling(time/stepperyr)) %>% dplyr::select(year, meantemp) %>% write_csv(paste0(o.name,"observation_temperature_NOBA_",survey.name,".csv"))