In delomast/fishCompTools: Tools to Assist With Fish Composition Estimates, Particularly Using Parental Based Tagging (PBT)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(fishCompTools)

Introduction

This document will walk through how to estimate composition of a mixture of all ad-clipped fish by PBT group under the assumption that all fish in your sample belong to one of your PBT groups. In other words, the "true" number of fish that don't belong to one of your PBT groups is 0. This example uses GSI assignments, but if you do not calculate GSI assignments for fish, you can create a dummy variable that gives all fish (including PBT-assigned fish) the same GSI assignment.

We'll be assuming these fish are all ad-clipped, but a similar procedure can be carried out for ad-intact as well by adjust the appropriate options.

First, get some data

First, we will generate some data using the built in data-simulation function.

set.seed(130) #make generated data reproducible
# define the unknown "true" proportions at which fish in each
# PBT group assign to the different GSI groups
pbtGSImat <- matrix(c(.1, .8, .1,.8, .1, .1,.1, .1, .8), nrow = 3, ncol = 3, byrow = TRUE)

# now generate three strata from identical parameters
multStratData <- data.frame()
for(i in 1:3){
    tempData <- generatePBTGSIdata(sampRate = .2, censusSize = 3000, relSizePBTgroups = c(1,2,3), tagRates = c(.8, .85,.9), physTagRates = 0, true_clipped = 0, true_noclip_H = 1, true_wild = 0, relSizeGSIgroups = c(1,2,1), PBT_GSI_calls = pbtGSImat)
    trapData <- tempData[[1]]
    trapData$Strata <- i
    multStratData <- rbind(multStratData, trapData)
}
tags <- tempData[[2]] # get the tag rates for the PBT groups
multStratData$GSI <- paste0("GSIgroup_", multStratData$GSI)
multStratData$AdClip <- "AD" #flip from unclipped to clipped

head(multStratData)

Since you have actual data, if it's in the format you would use for scobi_deux, you can load it in using the below example commands:

#tag rates
tags <- read.csv("PBT tag rates.csv", stringsAsFactors = FALSE)
tags <- tags[tags[,1] != "Unassigned",] #remove Unassigned from tag rate dataframe

#trap data
multStratData <- read.csv("2018RearHscobi.csv", stringsAsFactors = FALSE)

#window counts
window <- read.csv("Clip_window.csv", stringsAsFactors = FALSE)

# The MCpbt function expects a strata column in the trap data. So next, we add it using the window count file.
### NOTE this code assumes you have a column named "WeekNumber"

multStratData$Strata <- NA #initialize column
for(i in 1:nrow(window)){
    multStratData$Strata[multStratData$WeekNumber == window[i,1]] <- window[i,3]
}

Second, prep the data for the estimation function

There are a few different ways you can have the function determine what the GSI assignment composition of PBT groups is. First decide whether to include data from other strata or not:

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in the current strata
For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in any strata

Now we need to prepare the data in the way the estimation function can use it. We will do this using a function in the package that takes the two dataframes we created above as input.

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in the current strata

## Not run in this vignette
#### important note: symPrior=0, this will make modifying the input in the next step easier
# note that we are using AI=FALSE because we are looking at the composition
#  of ad-clipped fish
mainInput <- prepStrata(multStratData, tags, GSIcol = "GSI", PBTcol = "GenParentHatchery", strataCol = "Strata", adFinCol = "AdClip", AI = FALSE, symPrior = 0)

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in any strata

#### important note: symPrior=0, this will make modifying the input in the next step easier
# note that we are using AI=FALSE because we are looking at the composition
#  of ad-clipped fish
mainInput <- prepStrata(multStratData, tags, GSIcol = "GSI", PBTcol = "GenParentHatchery", strataCol = "Strata", adFinCol = "AdClip", AI = FALSE, symPrior = 0, GSIgroups = sort(unique(multStratData$GSI)))

# now we are combining information about GSI assignments of PBT groups across strata
gsiCats <- sort(unique(multStratData$GSI))
pbtCats <- sort(unique(multStratData$GenParentHatchery))
pbtCats <- pbtCats[pbtCats != "Unassigned"]
gsiAllStrata <- matrix(0, nrow = length(pbtCats), ncol = length(gsiCats))
colnames(gsiAllStrata) <- gsiCats
rownames(gsiAllStrata) <- pbtCats
## here we are counting GSI assignments of all PBT-assigned fish in the dataset
for(p in pbtCats){
    for(g in gsiCats){
        gsiAllStrata[p,g] <- sum(multStratData$GenParentHatchery == p & multStratData$GSI == g)
    }
}

#here we are incorporation the assignemnts from other strata as a prior
for(i in 1:length(mainInput)){
    p <- rownames(mainInput[[i]]$prior_pi_gsi)
    g <- colnames(mainInput[[i]]$prior_pi_gsi)
    mainInput[[i]]$prior_pi_gsi <- gsiAllStrata[p,g] - mainInput[[i]]$ohnc_gsi
}

Now, we have to decide whether we want to consider that fish in a given PBT group may have GSI assignments that were not observed amongst the fish PBT-assigned to that group in our dataset. If we do, we can modify the prior to allow this. Below, it is shown how you can add 1/N to the alpha values of the relevant Dirichlet prior to accomplish this (where N is the number of GSI categories). Values other than 1/N can be added. The effect of using alternative values is beyond the scope of this vignette.

## NOT run in this vignette
for(i in 1:length(mainInput)){
    ng <- ncol(mainInput[[i]]$prior_pi_gsi)
    mainInput[[i]]$prior_pi_gsi <- mainInput[[i]]$prior_pi_gsi + (1/ng)
}

Now continue modifying the input

This has created an input list that has one item for each strata.

To run the estimates the way we want, we have to modify the prior and initial values for each strata. Essentially, we will tell it that all the GSIgroup_#'s are 0. This means any Unassigned fish must be untagged fish belonging to one of the PBT groups.

We set symPrior=0 earlier, which takes care of the prior, now we have to modify the initial values. We are setting them so that the initial values for all the "Unassigned" groups are 0.

for(i in 1:length(mainInput)){
    pbtGroups <- mainInput[[i]]$groups[1:mainInput[[i]]$nPBT]
    mainInput[[i]]$piTotInitial <- mainInput[[i]]$ohnc / sum(mainInput[[i]]$ohnc)
    for(j in 1:length(mainInput[[i]]$initZ)){
        tempGroups <- mainInput[[i]]$ohnc_gsi[,mainInput[[i]]$gsiUT[j]] + mainInput[[i]]$prior_pi_gsi[,mainInput[[i]]$gsiUT[j]]
        tempGroups <- which(tempGroups > 0)
        if(length(tempGroups) == 0) print("Error, GSI assignment not compatible with the prior or observed PBT-assigned fish")
        mainInput[[i]]$initZ[j] <- sample(tempGroups, 1)
    }
    mainInput[[i]]$oUTInitial <- rep(0, length(mainInput[[i]]$oUTInitial))
    for(g in 1:length(pbtGroups)){
        mainInput[[i]]$oUTInitial[g] <- sum(mainInput[[i]]$initZ == pbtGroups[g])
    }
}

Everything is now set up, so...

Calculate the estimates

# specifying a seed to make the anlaysis reproducible
# running 10000 iterations with 1000 burn-in and no thinning
propEstimates <- estimStrataMCpbt(mainInput, iter = 10000, burnIn = 1000, thin = 1, seed = 7)

We now have estimates in terms of proportions. You can look at convergence and mixing of the chain here if you want. One simple option is just to look at the trace of each groups' population size in each strata:

## This will create a pdf file with plots for you to look through
pdf("traces_to_check.pdf", width = 11, height = 8.5)
for(i in 1:length(propEstimates)){
    tempPiTot <- propEstimates[[i]]$piTot
    nPBT <- mainInput[[i]]$nPBT
    for(j in 1:nPBT){
        plot(tempPiTot[,j], type = "l", cex.axis = 2)
    }
}
dev.off()

We can now multiply by our population size to turn these into numbers of fish.

The function is set up to allow a user to propagate uncertainty in the escapement estimates. Since at BONN it is assumed the window count is the population size, we will just give it the same number over and over. This block of code takes the window count file (in scobi_deux format) and generates the window count numbers in the format the function needs.

window <- data.frame(weekNumber = c(1,2,3), count = c(3000,3000,3000), strata = c(1,2,3))

#make the window count input required
numMCMC <- nrow(propEstimates[[1]]$piTot) # this gets the number of MCMC samples we took
strataOrder <- sapply(mainInput, function(x) return(x$strataName))

escapementByStrata <- list()
for(s in strataOrder){
    total <- sum(window[window[,3] == s,2])
    escapementByStrata[[s]] <- rep(total, numMCMC)
}

Then, we use the built in function to multiply everything by the window count.

countEstimates <- multByEscapement(mainInput, mainRes = propEstimates, popSizes = escapementByStrata, writeSummary = FALSE)

What we are really interested in is just one entry of countEstimates, so let's pull it out and name it separately for ease.

runEstimate <- countEstimates$totPiTotEstim
head(runEstimate)

runEstimate has the estimated number of fish in each group for the total of the run in each iteration of the MCMC sampler that we saved. Notice that the GSIgroup_# are all 0, because we constrained them to be as such. Now, if you just want numbers and CIs for each hatchery, you can:

#point estimates
print("Point Estimates")
apply(runEstimate, 2, mean)

#CIs
print("CIs")
alpha <- .1
apply(runEstimate, 2, quantile, c(alpha/2, 1-(alpha/2)))

If you want to combine specific hatcheries into larger groups, you just add columns. For example, to get a point estimate and CI for pbtGroup1 + pbtGroup2:

#point estimates
print("Point Estimate")
mean(runEstimate[,"pbtGroup1"] + runEstimate[,"pbtGroup2"])

#CIs
print("CI")
alpha <- .1
quantile(runEstimate[,"pbtGroup1"] + runEstimate[,"pbtGroup2"], c(alpha, 1-(alpha/2)))

Adding columns manually can be pretty tedious if you have a lot of combining to do. So, let me know if that is the case, and I can help put together a block of code that will take an input file showing which groups to combine and automate the process.

delomast/fishCompTools documentation built on Jan. 11, 2021, 1:51 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

delomast/fishCompTools
Tools to Assist With Fish Composition Estimates, Particularly Using Parental Based Tagging (PBT)

In delomast/fishCompTools: Tools to Assist With Fish Composition Estimates, Particularly Using Parental Based Tagging (PBT)

Introduction

First, get some data

Second, prep the data for the estimation function

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in the current strata

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in any strata

Now continue modifying the input

Calculate the estimates

R Package Documentation

Browse R Packages

We want your feedback!

delomast/fishCompTools Tools to Assist With Fish Composition Estimates, Particularly Using Parental Based Tagging (PBT)

In delomast/fishCompTools: Tools to Assist With Fish Composition Estimates, Particularly Using Parental Based Tagging (PBT)

Introduction

First, get some data

Second, prep the data for the estimation function

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in the current strata

For a given PBT group, consider GSI assignments that are observed for fish in that PBT group in any strata

Now continue modifying the input

Calculate the estimates

R Package Documentation

Browse R Packages

We want your feedback!

delomast/fishCompTools
Tools to Assist With Fish Composition Estimates, Particularly Using Parental Based Tagging (PBT)