\newpage

knitr::opts_chunk$set(echo = TRUE)
require(devtools)
require(ggplot2)
require(gridExtra)
library(pander)
devtools::install_github("rooperc4/GroundfishCondition")
library("GroundfishCondition")
options(width=50, width.cutoff=50, digits = 3) 

PURPOSE

The purpose of this package is to provide a set of functions and a template for R Markdown documents for the computation of groundfish condition in Alaska for the ecosystem considerations chapter of the SAFE. The functions can also be used to calculate L-W residuals for other purposes.

DATA

The example data sets given here are for the EBS, AI and GOA through the 2017 bottom trawl survey years. However, other surveys that collect similar data (e.g. EcoFOCI surveys, RPA surveys and MACE-AT surveys) can use these functions as well. The r markdown script was prepared by E. Siddon and the functions and package compiled by C. Rooper. A more detailed description of the analyses and ecological relevance can be found in a draft manuscript (Boldt, Rooper and Hoff, in prep, that should be completed in late 2018).

The example data set included in this package is called lwdata and is specimen data (length and weight specimens) for all species where this data was collected on RACE bottom trawl surveys. The data was initially extracted from RACEBASE tables (including HAUL, SPECIMEN, SPECIES and STRATA tables). Other sources of data work fine as well, the key points are to have individual lengths and weights, a species identifier, a strata identifier (if you want some sort of by-area calculations) and a year. Here is what the RACEBASE data look like.

data("lwdata")
#create a year variable from the cruise data
lwdata["YEAR"]<-round(lwdata["CRUISE"]/100,digits=1)
pander::pandoc.table(lwdata[1:6,])

FUNCTIONS

The key function for this analysis is the calculation of residuals from a log-log relationship between length and weight the lw.resids function inputs the length and weight data for individual fish and estimates the residuals.

lw.resids<-function(length,weight){
    loglength<-log(length)
    logwt<-log(weight)
    lw.res<-lm(logwt~loglength)
    lw.res<-lw.res$residuals
return(lw.res)}

There is also a function to weight the residuals with the catch at the trawl survey station and there is some additional functionality that allows you to automatically detect and remove outliers.

EXAMPLE

As an example of calculating the condition index, we take Pacific Cod in the eastern Bering Sea. The code is designed to loop through a number of species (e.g. see the EBS_Groundfish_COndition.Rmd file which computes the index for a group of species used in the Ecosystem Considerations SAFE Chapter). For this example, we are doing only a single species, but have left the architecture for multiple species intact. Also, please note that some of the steps such as the STRATUM definitions are specific to the RACEBASE EBS data and may not be needed for other applications.

EBS.species<-21720 #21720 is the species code for Pacific cod in RACEBASE
EBS.speciesnames<-"Pacific cod" #This is a variable used only in the graphical output
EBS.lwdata<-subset(lwdata,lwdata["REGION"]=="BS") #Subset the Length-weight data collected in the Bering Sea only
EBS.lwdata["STRATUM"]<-round(EBS.lwdata["STRATUM"],digits=-1) #Kluge to combine strata 61, 62 to a single strata and strata 31, 32 to a single strata and strata 41, 42, and 43 into a single stata. See text for details.
EBS.lwdata<-subset(EBS.lwdata,EBS.lwdata["STRATUM"]<=60&EBS.lwdata["STRATUM"]>0)#Remove the northern Bering Sea strata.
p<-substr(EBS.lwdata[,"STATIONID"],2,2) #Identify corner stations
EBS.lwdata<-subset(EBS.lwdata,p=="-" ) #Remove corner stations

The plotting by year is done in ggplot2 using gridExtra. This code can be modified as needed, but basically the mean L-W residuals (and variances) are calculated and put into a bar plot with year as the x-axis and mean residual on the y-axis. We have made the plots into a list object (myplot) so that they can easily be arranged onto a gridded figure.

myplot<-list() #Make the empty list
#myplot[]
lwdata_by_year<-array(dim=c(0,6))
colnames(lwdata_by_year)<-c("species","yrs","ymeans","yn","ysd","yse")
for(i in 1:length(EBS.species)){ #set up the loop to loop through species
    tempdata<-subset(EBS.lwdata,EBS.lwdata$SPECIES_CODE==EBS.species[i]) #subset the data to the species of interest
    tempdata["residuals"]<-lw.resids(tempdata$LENGTH,tempdata$WEIGHT,outlier.rm=TRUE) # Use the lw.resids function to calculate the residuals
    tempdata["residuals.wt"]<-weighted_resids(tempdata$YEAR,tempdata$residuals,tempdata$CATCH)
    tempdata<-subset(tempdata,is.na(tempdata$residuals.wt)==FALSE) 
    yrs=sort(unique(tempdata$YEAR)) #Sort by year
    ymeans=tapply(tempdata$residuals.wt,tempdata$YEAR,mean) #Calculate mean by year
    yn=tapply(tempdata$residuals.wt,tempdata$YEAR,length) #Count the number of observations by year
    ysd=tapply(tempdata$residuals.wt,tempdata$YEAR,sd) #Calculate the sd of the mean by year
    yse=ysd/sqrt(yn) #Calculate the standard error
    data.summary<-data.frame(EBS.species[i],yrs,ymeans,yn,ysd,yse) #Put the mean, SE and year into a data frame for plotting
lwdata_by_year<-rbind(lwdata_by_year,data.summary)  
p<-ggplot(data.summary, aes(x = yrs, y = ymeans),cex=2) +  
  geom_bar(position = position_dodge(), stat="identity", fill="cornflowerblue",col="black") + 
  geom_errorbar(aes(ymin=ymeans-yse, ymax=ymeans+yse),width=0.30) +
 xlim(1996.5, 2017.5)+
 ggtitle(paste(EBS.speciesnames[i])) + 
    geom_hline(yintercept=0, color="black")+
  theme_bw() +
  theme(panel.grid.major = element_blank())+
    theme(axis.text.x = element_text(size=8))+
theme(axis.text.y = element_text(size=8))+
theme(axis.title.x = element_text(size=8))+
theme(axis.title.y = element_text(size=8))+
labs(title = paste(EBS.speciesnames[i]), y = "Length-weight residual", x = "Year")
#print(p)
pltName <- paste( EBS.speciesnames[i],"plot", sep = '' ) #Name the plot
myplot[[pltName]]<-p} #Add the plot to the list and loop

This line of code does the arranging of the plots on a grid. In this case we have a single species (Pacific cod), so the number of columns (ncol) is set to 1. For the Ecosystem Considerations contributions there are usually 2 columns (ncol=2). It also outputs the figure as a .png to the working directory and outputs the data to a .csv file.

png("EBSbyyear.png",width=6,height=7,units="in",res=300)
grid.arrange(grobs=myplot,ncol=1)
dev.off()
grid.arrange(grobs=myplot,ncol=1)
write.csv(lwdata_by_year,"lwdata_by_year.csv",row.names=FALSE)

In this next bit of code, the same procedures are followed for calculating the residuals and producing plots. This time the plots are by strata and year. Plots and data are output at the end.

lwdata_by_strata<-array(dim=c(0,7))
colnames(lwdata_by_strata)<-c("species","strata","yrs","ymeans","yn","ysd","yse")
#By stratum graphs
for(i in 1:length(EBS.species)){
    tempdata<-subset(EBS.lwdata,EBS.lwdata$SPECIES_CODE==EBS.species[i])
    tempdata["residuals"]<-lw.resids(tempdata$LENGTH,tempdata$WEIGHT, outlier.rm=TRUE)
    tempdata["residuals.wt"]<-weighted_resids(tempdata$YEAR,tempdata$residuals,tempdata$CATCH)
    tempdata<-subset(tempdata,is.na(tempdata$residuals.wt)==FALSE) 
ymeans=aggregate(tempdata$residuals.wt,by=list(tempdata$YEAR,tempdata$STRATUM),mean)
ysd=aggregate(tempdata$residuals.wt,by=list(tempdata$YEAR,tempdata$STRATUM),sd)
yn=aggregate(tempdata$residuals.wt,by=list(tempdata$YEAR,tempdata$STRATUM),length)
yse=ysd$x/sqrt(yn$x)
data.summary<-data.frame(species=EBS.species[i],strata=ymeans$Group.2,yrs=ymeans$Group.1,ymeans=ymeans$x,yn=yn$x,ysd=ysd$x,yse=yse)
lwdata_by_strata<-rbind(lwdata_by_strata,data.summary)

dat1 <- subset(data.summary,data.summary$ymeans>=0)
dat2 <- subset(data.summary,data.summary$ymeans< 0)
p2<-ggplot() +
    geom_bar(data = dat1, aes(x=yrs, y=ymeans, fill=factor(strata)),stat = "identity",col="black") +
    geom_bar(data = dat2, aes(x=yrs, y=ymeans, fill=factor(strata)),stat = "identity",col="black")+
 scale_fill_brewer(palette = "Spectral")+
    geom_hline(yintercept=0, color="black")+
 xlim(1996.5, 2017.5)+
theme_bw() +
    theme(axis.text.x = element_text(size=8))+
theme(axis.text.y = element_text(size=8))+
theme(axis.title.x = element_text(size=8))+
theme(axis.title.y = element_text(size=8))+
  theme(panel.grid.major = element_blank())+theme(legend.position="right")+
 labs(title = paste(EBS.speciesnames[i]), y = "Length-weight residual", x = "Year", fill = "Stratum")
#print(p2)
pltName <- paste( EBS.speciesnames[i],"plot", sep = '' )
myplot[[pltName]]<-p2}

png("bystrata.png",width=6,height=7,units="in",res=300)
grid.arrange(grobs=myplot,ncol=1)
dev.off()
grid.arrange(grobs=myplot,ncol=1)
write.csv(lwdata_by_strata,"lwdata_by_strata.csv",row.names=FALSE)

Here are the plots are by strata and year with individual plots for each strata (for J. Ianelli). Plots and data are output at the end.

lwdata_by_strata<-array(dim=c(0,7))
colnames(lwdata_by_strata)<-c("species","strata","yrs","ymeans","yn","ysd","yse")
myplot<-list() 
#By stratum graphs
for(i in 1:length(EBS.species)){
    tempdata<-subset(EBS.lwdata,EBS.lwdata$SPECIES_CODE==EBS.species[i])
    tempdata["residuals"]<-lw.resids(tempdata$LENGTH,tempdata$WEIGHT, outlier.rm=TRUE)
    tempdata["residuals.wt"]<-weighted_resids(tempdata$YEAR,tempdata$residuals,tempdata$CATCH)
    tempdata<-subset(tempdata,is.na(tempdata$residuals.wt)==FALSE) 
ymeans=aggregate(tempdata$residuals.wt,by=list(tempdata$YEAR,tempdata$STRATUM),mean)
ysd=aggregate(tempdata$residuals.wt,by=list(tempdata$YEAR,tempdata$STRATUM),sd)
yn=aggregate(tempdata$residuals.wt,by=list(tempdata$YEAR,tempdata$STRATUM),length)
yse=ysd$x/sqrt(yn$x)
data.summary<-data.frame(species=EBS.species[i],strata=ymeans$Group.2,yrs=ymeans$Group.1,ymeans=ymeans$x,yn=yn$x,ysd=ysd$x,yse=yse)
lwdata_by_strata<-rbind(lwdata_by_strata,data.summary)

ustrata<-unique(data.summary$strata)
for(j in 1:length(ustrata)){
  dat1<-subset(data.summary,data.summary$strata==ustrata[j])
  signymeans<-sign(dat1$ymeans)*(-.6)
p2<-ggplot(data=dat1,aes(x=yrs,y=ymeans,fill=factor(strata))) +
    geom_bar(position=position_dodge(),stat = "identity",col="black") +
#  geom_bar(position = position_dodge(), stat="identity", fill="cornflowerblue",col="black") + 
  geom_errorbar(aes(ymin=ymeans-yse, ymax=ymeans+yse),width=0.30)+
   geom_text(data=dat1,aes(x=yrs,y=sign(ymeans)*(abs(ymeans)+yse),label=yn),vjust=signymeans,size=1.5) +
  #    geom_bar(data = dat2, aes(x=yrs, y=ymeans, fill=factor(strata)),stat = "identity",col="black")+
 scale_fill_brewer(palette = "Spectral")+
    geom_hline(yintercept=0, color="black")+
 xlim(1996.5, 2017.5)+
theme_bw() +
    theme(axis.text.x = element_text(size=8))+
theme(axis.text.y = element_text(size=8))+
theme(axis.title.x = element_text(size=8))+
theme(axis.title.y = element_text(size=8))+
  theme(panel.grid.major = element_blank())+theme(legend.position="none")+
 labs(title = paste(EBS.speciesnames[i]," - stratum ",ustrata[j],sep=""), y = "Length-weight residual", x = "Year", fill = "Stratum")
pltName <- paste( EBS.speciesnames[i],ustrata[j],"plot", sep = '' )
myplot[[pltName]]<-p2}

png(paste(EBS.speciesnames[i],"bystrata.png",sep="_"),width=6,height=7,units="in",res=300)
grid.arrange(grobs=myplot,ncol=2)
dev.off()
grid.arrange(grobs=myplot,ncol=2)}

write.csv(lwdata_by_strata,"lwdata_by_strata.csv",row.names=FALSE)

GROUNDFISH CONDITION INDICES FOR ALASKA ESR

There are 3 r markdown scripts included in this package that can be used to produce the groundfish condition indices for the annual updates to the ecosystem contributions section of the SAFE documents. They are:

GOA_GroundfishCondition.rmd

EBS_GroundfishCondition.rmd

AI_GroundfishCondition.rmd

These rmarkdown scripts are based on E. Siddon's conversion of the original R script and will create a word document and associated figures. It should only be necessary to update the length-weight data from RACEBASE and then the r markdown text (to reflect any new trends or results).



rooperc4/GroundfishCondition documentation built on May 4, 2019, 1:22 p.m.