F.bootstrap.passage: F.bootstrap.passage
In tmcd82070/CAMP_RST: CAMP RST R Routines

F.bootstrap.passage

R Documentation

F.bootstrap.passage

Description

Bootstrap or Monte-Carlo simulate data sufficient to compute confidence intervals for passage.

Usage

F.bootstrap.passage(
  grand.df,
  catch.fits,
  catch.Xmiss,
  catch.gapLens,
  catch.bDates.miss,
  eff.fits,
  eff.X,
  eff.ind.inside,
  eff.X.dates,
  eff.X.obs.data,
  eff.type,
  sum.by,
  R,
  ci = T
)

Arguments

`grand.df`	A data frame containing both daily estimated passage and efficiency, for each trap.
`catch.fits`	A list of Poisson fitted `glm` objects for each trap, possibly with basis spline covariates, used to impute missing catches.
`catch.Xmiss`	A list containing a spline basis matrix of imputed days where catch is missing for each trap.
`catch.gapLens`	A list, for each trap, containing a numeric vector of hours of time spent `"Not fishing"`, originating from variable `TrapStatus` in original catch data queries. All entries are necessarily less than 24.
`catch.bDates.miss`	A list containing a POSIX vector of `"Not fishing"` `batchDate`s for missing catches, for each trap. Necessary because one `batchDate` may have two (or more) periods with no fishing.
`eff.fits`	A list of binomial logistic regression fitted objects used to compute efficiency. One per trap.
`eff.X`	A list containing the basis matrix associated with each efficiency-spline model for each trap. These matrices originate from use of function `bs` in function `F.efficiency.model`.
`eff.ind.inside`	A list containing the first and last day of trapping for each trap.
`eff.X.dates`	A list containing the dates for which missing efficiency must be estimated, for each trap.
`eff.X.obs.data`	A list containing the raw observed efficiency data used to fit efficiency models and estimate bias-corrected efficiencies.
`eff.type`	A list containing the type of efficiency model utilized for each trap. See `eff_model.r` for details.
`sum.by`	A text string indicating the temporal unit over which daily estimated catch is to be summarized. Can be one of `day`, `week`, `month`, `year`.
`R`	An integer specifying the number of Monte Carlo iterations to do.
`ci`	A logical indicating if 95% bootstrapped confidence intervals should be estimated along with passage estimates.

Details

To bootstrap the estimated passage for a particular trap, random realizations of passage must be generated. Variability in passage can originate from two sources: imputed catch and imputed efficiency. Imputed catch originates from periods of "Not fishing" in excess of two hours, while imputed efficiency results from days between the first and last day of a recorded efficiency trial. Any one day may lead to several instances of imputed catch but, at most, only one instance of imputed efficiency. Since days of operation vary across traps, the imputation periods vary as well.

Bootstrapping of each of catch and efficiency is organized via matrices of dimension `nrow(grand.df)` × `R`, where rows hold unique trapping instances and columns hold the bootstrap replicates. Because efficiency is only estimated on a per-day basis, but multiple trapping instances can take place on any one day, catch data are summarized per day following initial bootstrap sampling, with corresponding multiple intra-day replicates summed. Imputed values within each of the resulting daily catch and efficiency matrices thus contribute to underlying stochastic variability.

For each trap, once sampling is complete, the catch matrix is divided element-wise by the efficiency matrix, so that the (i, j)th entry of the resulting passage sampling matrix corresponds to the jth passage replicate of the ith day. These daily passage estimates, over each replicate, are then summarized via function summarize.passage over the temporal unit specified via sum.by. In this way, R samples for each unique temporal unit within the date range of grand.df are obtained.
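The division and summary steps can be sketched in base R. This is only an illustration under stated assumptions: the simulated `catch` and `eff` matrices, the four example dates, and the use of `rowsum` in place of `summarize.passage` are all hypothetical and are not taken from the package source.

```r
set.seed(1)
R <- 5                                   # number of replicates (hypothetical)
dates <- seq(as.Date("2020-01-30"), as.Date("2020-02-02"), by = "day")

# Daily catch and efficiency replicate matrices: rows = days, columns = replicates.
catch <- matrix(rpois(length(dates) * R, lambda = 50), nrow = length(dates), ncol = R)
eff   <- matrix(runif(length(dates) * R, 0.05, 0.20), nrow = length(dates), ncol = R)

# Element-wise division: entry (i, j) is the j-th passage replicate for day i.
passage <- catch / eff

# Sum daily passage within each month, separately for every replicate.
# rowsum() stands in for summarize.passage with sum.by = "month" (assumption).
month <- format(dates, "%Y-%m")
passage.by.month <- rowsum(passage, group = month)
dim(passage.by.month)                    # one row per month, one column per replicate
```

Each row of `passage.by.month` then holds the R replicates for one temporal unit, which is the input needed for the interval calculation.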

Given the R replicates for each unique time period, 95% bias-corrected confidence intervals are obtained. These confidence intervals correct for non-symmetric passage replicates.
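As a rough illustration of a bias-corrected percentile interval, the sketch below follows the commonly described form of the method, in which the nominal percentile levels are shifted according to the share of replicates that fall below the point estimate. The helper name `bc.ci` and the simulated replicates are hypothetical, and it is an assumption that the package computes its intervals in exactly this way.

```r
# Bias-corrected percentile interval (sketch; bc.ci is a hypothetical helper).
bc.ci <- function(reps, est, conf = 0.95) {
  alpha <- 1 - conf
  # Bias-correction constant: normal quantile of the share of replicates below est.
  # Note: z0 is infinite if all (or none) of the replicates fall below est.
  z0 <- qnorm(mean(reps < est))
  # Shift the nominal percentile levels by 2 * z0.
  p <- pnorm(2 * z0 + qnorm(c(alpha / 2, 1 - alpha / 2)))
  unname(quantile(reps, probs = p))
}

set.seed(42)
reps <- rlnorm(1000, meanlog = log(1000), sdlog = 0.3)  # right-skewed replicates
ci <- bc.ci(reps, est = 1000)
ci
```

When exactly half of the replicates fall below the estimate, `z0` is zero and the interval reduces to the ordinary percentile interval.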

Value

A data frame containing 95% bias-adjusted confidence intervals for all unique temporal units summarized via specification of sum.by.

Variance Matrices

Catch models are fit via a Poisson generalized linear model. Often, these models are overdispersed, with a Pearson overdispersion parameter that is large relative to one. Catch often has a much higher-than-expected variance, due to seasonal fish pulses. To account for outliers in this case, the largest and smallest 20 residuals are removed, and the dispersion statistic is recalculated. If the recalculated dispersion statistic is less than one, it is set to one.

The modified overdispersion statistic is then multiplied by the variance-covariance matrix of the original model fit; in this way, standard errors are recalculated via a modified quasilikelihood approach.
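The catch variance adjustment might look roughly like the following. The simulated data and the single `day` covariate are hypothetical, and the degrees-of-freedom adjustment in the denominator is an assumption rather than something stated here.

```r
set.seed(2)
n <- 200
day <- seq_len(n)
# Overdispersed counts: negative binomial draws fit with a Poisson GLM.
catch <- rnbinom(n, mu = exp(3 + 0.005 * day), size = 2)
fit <- glm(catch ~ day, family = poisson)

# Pearson residuals, with the 20 smallest and 20 largest set aside.
res <- sort(residuals(fit, type = "pearson"))
n.trim <- 20
kept <- res[(n.trim + 1):(length(res) - n.trim)]

# Trimmed dispersion statistic; subtracting the removed residuals from the
# residual degrees of freedom is an assumption of this sketch.
disp <- sum(kept^2) / (df.residual(fit) - 2 * n.trim)
disp <- max(disp, 1)                     # values below one are set to one

# Scale the original variance-covariance matrix by the modified dispersion.
V.adj <- disp * vcov(fit)
sqrt(diag(V.adj))                        # adjusted standard errors
```

Because `vcov` on a Poisson fit assumes a dispersion of one, multiplying by `disp` can only widen the standard errors.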

Efficiency models are fit via a binomial generalized linear model. As a discrete model, these also can be overdispersed. However, the efficiency trial data are generally sparse. As a result, instead of removing the top proportion of residuals greater than some percentile magnitude, those greater than an absolute cut-off are removed. Here, any Pearson residual with an absolute value greater than 8 is removed. Following the removal of all extreme residuals, the resulting overdispersion is calculated and then applied to the variance-covariance matrix of the original binomial fit. If the resulting dispersion statistic is less than one, it is set to one. Traps with one efficiency trial also have overdispersions set to one. Similar to the variance adjustment applied to catch, this is a modified quasilikelihood approach.
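A comparable sketch for the efficiency models uses the absolute cut-off instead of a fixed count of residuals. The helper `eff.dispersion`, the intercept-only model, and the simulated trials are hypothetical; the degrees-of-freedom adjustment is again an assumption.

```r
# Dispersion after dropping Pearson residuals beyond an absolute cut-off
# (eff.dispersion is a hypothetical helper, not a package function).
eff.dispersion <- function(fit, cutoff = 8) {
  res <- residuals(fit, type = "pearson")
  keep <- abs(res) <= cutoff
  df <- df.residual(fit) - sum(!keep)    # assumed degrees-of-freedom adjustment
  if (df <= 0) return(1)                 # e.g. a single trial: fall back to one
  max(sum(res[keep]^2) / df, 1)          # values below one are set to one
}

set.seed(3)
nReleased <- rep(500, 15)
nCaught <- rbinom(15, size = nReleased, prob = 0.1)
fit <- glm(cbind(nCaught, nReleased - nCaught) ~ 1, family = binomial)

disp <- eff.dispersion(fit)
V.adj <- disp * vcov(fit)                # adjusted variance-covariance matrix
```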

When there are fewer than ten observed efficiency trials for one trap, a bias-corrected efficiency is calculated in lieu of a model fit. This efficiency is calculated simply as the sum of the nCaught fish plus one, divided by the sum of the nReleased fish plus one. The plus-one manipulation prevents the direct estimation of variance via a formal generalized linear (or additive) model. In this case, bootstrap samples originate from a multivariate distribution with mean equal to the bias-corrected efficiency, and a variance equal to the traditional generalized linear model (glm) variance, but with back-transformed model-derived estimates of observed efficiencies replaced with their bias-corrected equivalents. See McCulloch and Searle (2001), or any other mathematical treatment of the generalized linear model, for details on the glm variance.
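The bias-corrected efficiency itself is simple arithmetic. The trial counts below are made up for illustration.

```r
# Three hypothetical efficiency trials for one trap.
nCaught   <- c(12, 8, 15)
nReleased <- c(100, 90, 120)

# Bias-corrected efficiency: (total caught + 1) / (total released + 1).
eff.bc <- (sum(nCaught) + 1) / (sum(nReleased) + 1)    # 36 / 311

# Uncorrected pooled efficiency, for comparison.
eff.raw <- sum(nCaught) / sum(nReleased)               # 35 / 310
```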

Random Realizations

Fitted catch models are used to generate random realizations of catch for each individual trap. To do this, we use the mvtnorm::rmvnorm function to randomly sample from a multivariate normal distribution, with dimension equal to the number of β coefficients in the trap's catch model. The mvtnorm::rmvnorm routine uses the vector of model coefficients as the mean of the multivariate distribution and the modified quasilikelihood variance-covariance matrix for the variance. To speed calculations, we use the Cholesky matrix decomposition to calculate the variance matrix root. All betas and variances are on the log scale because the Poisson catch models assume a log link.

After generation, we use each of the R multivariate-normal samples to create a new prediction for missing catches. Random missing catches are expanded by the log of the trap down-time, i.e., trap down-time or gap in fishing is an offset. We exponentiate the resulting imputed catch predictions and then combine them with the observed catch to create a dataset containing a catch record for every day of the season.
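The sampling and imputation steps can be sketched in base R. To keep the example self-contained, the multivariate normal draw is written out with `chol` rather than calling `mvtnorm::rmvnorm`; it should be equivalent in distribution to `mvtnorm::rmvnorm(R, mean = beta, sigma = V, method = "chol")`. The simulated data, the single `day` covariate, the design rows in `X.miss`, and measuring the offset in hours are all assumptions made for illustration.

```r
set.seed(4)
R <- 200                                  # number of Monte Carlo replicates
day <- 1:60
hours <- rep(24, 60)                      # hours fished per day (hypothetical)
catch <- rpois(60, lambda = exp(2 + 0.02 * day))
fit <- glm(catch ~ day + offset(log(hours)), family = poisson)

beta <- coef(fit)
V <- vcov(fit)                            # in practice, the dispersion-adjusted matrix

# Draw R coefficient vectors from N(beta, V) via the Cholesky root of V.
U <- chol(V)                              # upper triangular, t(U) %*% U equals V
Z <- matrix(rnorm(R * length(beta)), nrow = R)
beta.star <- sweep(Z %*% U, 2, beta, "+") # R x p matrix of sampled coefficients

# Design rows and down-time for two hypothetical gaps in fishing.
X.miss <- cbind(1, c(61, 62))
gap.hours <- c(6, 18)

# Linear predictor on the log scale plus the log down-time offset, then exponentiate.
log.mu <- X.miss %*% t(beta.star) + log(gap.hours)
imputed <- exp(log.mu)                    # rows = gaps, columns = replicates
```

Each column of `imputed` is one random realization of the missing catches, ready to be combined with the observed catch for that replicate.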

Author(s)

WEST Inc.

References

Manly, B. F. J. Randomization, Bootstrap and Monte Carlo Methods in Biology, Third Edition, 2006. Chapman and Hall/CRC.

McCulloch, C. E. and Searle, S. R. Generalized, Linear, and Mixed Models, 2001. Wiley Interscience.
