```r
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(binom)
library(BTSPAS)
library(ggplot2)
library(gridExtra)
library(kableExtra)
library(knitr)
max.width <- 70
```
Because of the length of time needed to run the vignettes, only static vignettes have been included with this package.
The original vignettes and code can be obtained from the GitHub site at https://github.com/cschwarz-stat-sfu-ca/BTSPAS
The two-sample capture-recapture experiment is one of the simplest possible studies with a long history. The standard Lincoln-Petersen estimator used to estimate abundance is $$ \widehat{N} = \frac{n_1 n_2}{m_2}$$ where $n_1$ is the number of animals captured, marked and released at the first capture event; $n_2$ is the number of animals captured at the second capture event; and $m_2$ is the number of animals from $n_1$ that were recaptured at the second event (i.e. the number of recaptured marked animals).
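For example, here is a minimal sketch of this estimator in R; the helper name lincoln_petersen is ours for illustration and is not part of $BTSPAS$:

```r
# Lincoln-Petersen estimator: Nhat = n1 * n2 / m2
# (hypothetical helper for illustration only -- not a BTSPAS function)
lincoln_petersen <- function(n1, n2, m2) {
  n1 * n2 / m2
}
lincoln_petersen(n1 = 200, n2 = 150, m2 = 30)   # returns 1000
```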
A key assumption of the Lincoln-Petersen estimator is that there is no correlation between capture probabilities at the two events. The most common way to satisfy this assumption is for the probability of capture to be equal for all animals at either the first or the second event.
If capture probabilities are heterogeneous, then estimates can be biased. One way to account for heterogeneous capture-probabilities is through stratification. For example, if males and females have different catchabilities, then separate Lincoln-Petersen estimators can be computed for each sex, and the estimated abundance for the entire population is found by summing the two estimated abundances (one for each sex).
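A sketch of the sex-stratified calculation just described, re-using the hypothetical lincoln_petersen() helper from above with made-up counts:

```r
# One Lincoln-Petersen estimate per stratum (sex), then sum the estimates
n1 <- c(male = 120, female = 80)   # marked and released at the first event
n2 <- c(male = 150, female = 60)   # captured at the second event
m2 <- c(male =  30, female = 12)   # marked animals among the n2
lincoln_petersen(n1, n2, m2)       # per-stratum estimates: 600 and 400
sum(lincoln_petersen(n1, n2, m2))  # total abundance: 1000
```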
Stratification can be based on animal attributes (e.g. sex), geographic location (e.g. parts of a study area), or time (e.g. week of capture). The $BTSPAS$ package deals with temporal stratification.
Consider an experiment to estimate the number of outgoing smolts on a small river. The run of smolts extends over several weeks. As smolts migrate, they are captured, marked with individually numbered tags, and released at the first capture location using, for example, a fishwheel. The migration continues, and a second fishwheel takes a second sample several kilometers downstream. At the second fishwheel, the captures consist of a mixture of marked fish (from the first fishwheel) and unmarked fish.

The efficiency of the fishwheels varies over time in response to stream flow, the size of the run passing the wheel, and other uncontrollable events. So it is unlikely that the capture probabilities are equal over time at either location, i.e. they are heterogeneous over time.
We suppose that we can temporally stratify the data into, for example, weeks, where the capture-probabilities are (mostly) homogeneous at each wheel in each week. Furthermore, suppose that fish captured and marked in each week tend to migrate together so that they are captured in a single subsequent stratum. For example, suppose that in each julian week $j$, $n1[j]$ fish are marked and released above the rotary screw trap. Of these, $m2[j]$ are recaptured. All recaptures take place in the week of release, i.e. the matrix of releases and recoveries is diagonal. The $n1[j]$ and $m2[j]$ establish the capture efficiency of the second trap in julian week $j$.
At the same time, $u2[j]$ unmarked fish are captured at the screw trap.
This implies that the data can be structured as a diagonal array similar to:
```
                               Recovery Stratum
                     tagged    rs1     rs2     rs3    ...    rsk
Marking      ms1     n1[1]     m2[1]   0       0      ...    0
Stratum      ms2     n1[2]     0       m2[2]   0      ...    0
             ms3     n1[3]     0       0       m2[3]  ...    0
             ...      ...      ...     ...     ...    ...    ...
             msk     n1[k]     0       0       0      ...    m2[k]

Newly captured,
Untagged                       u2[1]   u2[2]   u2[3]  ...    u2[k]
```
Here the tagging and recapture events have been stratified into $k$ temporal strata. Marked fish from one stratum tend to move at similar rates and so are recaptured together with unmarked fish. Recaptures of marked fish take place along the "diagonal."
Because the matrix is diagonal, and because the $u2$ vector is the same length as the $n1$ and $m2$ vectors, the data can be entered as several columns. Here is an example of some raw data:
```r
demo.data.csv <- textConnection(
'jweek,   n1,  m2,    u2
 9,        0,   0,  4135
10,     1465,  51, 10452
11,     1106, 121,  2199
12,      229,  25,   655
13,       20,   0,   308
14,      177,  17,   719
15,      702,  74,   973
16,      633,  94,   972
17,     1370,  62,  2386
18,      283,  10,   469
19,      647,  32,   897
20,      276,  11,   426
21,      277,  13,   407
22,      333,  15,   526
23,     3981, 242, 39969
24,     3988,  55, 17580
25,     2889, 115,  7928
26,     3119, 198,  6918
27,     2478,  80,  3578
28,     1292,  71,  1713
29,     2326, 153,  4212
30,     2528, 156,  5037
31,     2338, 275,  3315
32,     1012, 101,  1300
33,      729,  66,   989
34,      333,  44,   444
35,      269,  33,   339
36,       77,   7,   107
37,       62,   9,    79
38,       26,   3,    41
39,       20,   1,    23
40,     4757, 188, 35118
41,     2876,   8, 34534
42,     3989,  81, 14960
43,     1755,  27,  3643
44,     1527,  30,  1811
45,      485,  14,   679
46,      115,   4,   154')

demo.data <- read.csv(demo.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
print(demo.data)
```
The first stratum was defined as julian week `r demo.data$jweek[1]`. At this time, `r demo.data$n1[1]` fish were captured, tagged, and released, but `r demo.data$u2[1]` unmarked fish were recovered in the first recovery stratum. In the next week, `r demo.data$n1[2]` fish were captured, tagged, and released, with `r demo.data$m2[2]` fish recaptured, and `r demo.data$u2[2]` unmarked fish recovered.
```r
Nhat <- BTSPAS::SimplePetersen(
          sum(demo.data$n1),
          sum(demo.data$m2),
          sum(demo.data$u2))
```
A pooled-Petersen estimator would add all of the marked, recaptured, and unmarked fish to give an estimate of `r formatC(round(Nhat$N.est), digits=0, width=20, format="f", big.mark=",")`, which seems unbelievable!
What went wrong? Let us first examine a plot of the estimated capture efficiency at the second trap for each (recovery) julian week.
```r
# Plot the estimated probability of recapture for each stratum along
# with a confidence limit. Use the binom package to find the ci when the
# number of recaptures is zero.
demo.data$phat     <- demo.data$m2/demo.data$n1
demo.data$phat.lcl <- binom::binom.confint(demo.data$m2, demo.data$n1, method="exact")$lower
demo.data$phat.ucl <- pmin(0.3, binom::binom.confint(demo.data$m2, demo.data$n1, method="exact")$upper)

ggplot(data=demo.data, aes(x=jweek, y=phat))+
   ggtitle("Estimated recapture probability by julian week")+
   geom_point()+
   geom_line()+
   geom_errorbar(aes(ymin=phat.lcl, ymax=phat.ucl), width=.1)+
   ylab("Estimated recapture probability (95% ci)\nMean recapture probability in red")+
   xlab("Julian week")+
   ylim(0, .3)+
   geom_hline(aes(yintercept=sum(demo.data$m2)/sum(demo.data$n1)), color="red")
```
There are several unusual features: the estimated recapture probability varies considerably among weeks; julian week 13 has no recaptures of the 20 fish released; and julian week 41 has an implausibly low recapture probability (8 of 2876 fish released).
Similarly, let us look at the pattern of unmarked fish captured at the second trap:
```r
ggplot(data=demo.data, aes(x=jweek, y=log(u2)))+
   ggtitle("Number of unmarked fish captured by julian week")+
   geom_point()+
   geom_line()+
   xlab("Julian week")+
   ylab("log(Number of unmarked fish captured (u2))")
```
There are two julian weeks where the number of unmarked fish captured suddenly jumps by several orders of magnitude (remember the above plot is on the log() scale). These jumps correspond to releases of hatchery fish into the system.
Finally, let us look at the individual estimates for each stratum found by computing a Petersen estimator for the total number of unmarked fish for each individual stratum:
```r
demo.data$Uest <- BTSPAS::SimplePetersen(
                    demo.data$n1,
                    demo.data$m2,
                    demo.data$u2)$U.est

ggplot(data=demo.data, aes(x=jweek, y=log(Uest)))+
   ggtitle("Estimated total unmarked fish captured by julian week")+
   geom_point()+
   geom_line()+
   xlab("Julian week")+
   ylab("log(Estimated total unmarked fish)")
```
The $BTSPAS$ package attempts to strike a balance between the completely pooled Petersen estimator and the completely stratified Petersen estimator. In the former, capture probabilities are assumed to be equal for all fish in all strata, while in the latter, capture probabilities are allowed to vary among strata in an unstructured way. Furthermore, fish runs often have a general temporal structure, rather than jumping around arbitrarily from stratum to stratum.
Bonner and Schwarz (2011) developed a suite of models that add structure. In the basic model, the run of unmarked fish is modelled with a flexible spline over time, and the capture probabilities (on the logit scale) are modelled hierarchically, varying around a common mean.
The model also allows the user to use covariates to explain some of the variation in the capture probabilities. Bonner and Schwarz (2011) also developed models where fish are recaptured in more than one stratum.
The $BTSPAS$ package also has additional features and options, several of which are demonstrated below: handling strata with "bad" values of $n1$, $m2$, or $u2$; allowing the spline to jump after specified strata (e.g. when hatchery fish are released); fixing the capture probability in strata where the second trap was not operating; and modelling the capture probabilities with covariates such as stream flow.
I find it easiest if bad recaptures (a value of $m2$) result in zeroing out both $n1$ and $m2$ for that stratum.
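A minimal sketch of doing this by hand before fitting (working on a copy of the data; the week number is hypothetical, and the bad.m2 argument used later does similar bookkeeping for you):

```r
# Zero out both n1 and m2 for a stratum whose recapture value is not trusted
demo.data.clean <- demo.data          # work on a copy of the raw data
bad.weeks <- c(41)                    # hypothetical choice for illustration
demo.data.clean$n1[demo.data.clean$jweek %in% bad.weeks] <- 0
demo.data.clean$m2[demo.data.clean$jweek %in% bad.weeks] <- 0
```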
The $BTSPAS$ fitting function also allows you to specify strata with "bad" values of $n1$, $m2$, or $u2$; these are set to missing and estimated as part of the fit.

We already read in the data above. Here we set the rest of the parameters. Don't forget to set the working directory as appropriate.
```r
library("BTSPAS")

# After which weeks is the spline allowed to jump?
demo.jump.after <- c(22,39)  # julian weeks after which a jump occurs

# Which julian weeks have "bad" values? These will be set to missing and estimated.
demo.bad.m2 <- c(41)  # julian weeks with bad m2 values (this is used in the Trinity Example)
demo.bad.u2 <- c(11)  # julian weeks with bad u2 values (arbitrary, to demonstrate the feature)
demo.bad.n1 <- c(38)  # julian weeks with bad n1 values (arbitrary, to demonstrate the feature)

# The prefix for the output files:
demo.prefix <- "demo-JC-2003-CH-TSPDE"

# Title for the analysis
demo.title <- "Junction City 2003 Chinook "

cat("*** Starting ", demo.title, "\n\n")

# Make the call to fit the model and generate the output files
demo.fit <- TimeStratPetersenDiagError_fit(
              title      = demo.title,
              prefix     = demo.prefix,
              time       = demo.data$jweek,
              n1         = demo.data$n1,
              m2         = demo.data$m2,
              u2         = demo.data$u2,
              jump.after = demo.jump.after,
              bad.n1     = demo.bad.n1,
              bad.m2     = demo.bad.m2,
              bad.u2     = demo.bad.u2,
              InitialSeed= 890110,
              debug      = TRUE,   # generates only 10,000 iterations of the MCMC chain for checking
              save.output.to.files = FALSE)
```
The final argument (save.output.to.files) can be set to TRUE to automatically save plots and reports in files with the appropriate prefix in the working directory.
```r
# delete extra files that were created
file.remove("data.txt")
file.remove("CODAindex.txt")
file.remove("CODAchain1.txt")
file.remove("CODAchain2.txt")
file.remove("CODAchain3.txt")
file.remove("inits1.txt")
file.remove("inits2.txt")
file.remove("inits3.txt")
file.remove("model.txt")
```
The output object contains all of the results and can be saved for later interrogation. This is useful if the run takes considerable time (e.g. overnight) and you want to save the results for later processing. Notice that I didn't save the results below as part of this vignette.
```r
# Save the results in a data dump that can be read in later using the load() command.
# save(list=c("demo.fit"), file="demo-fit-saved.Rdata")  # save the results from this run
```
The final object has many components:

```r
save.options <- options(width = max.width)   # narrow the printed output
names(demo.fit)
options(save.options)                        # restore the print options
```
The plots sub-object contains many plots:

```r
save.options <- options(width = max.width)   # narrow the printed output
names(demo.fit$plots)
options(save.options)                        # restore the print options
```
In particular, it contains plots of the initial spline fit (init.plot), the final fitted spline (fit.plot), the estimated capture probabilities on the logit scale (logitP.plot), the posterior distributions of the total unmarked and total fish (post.UNtot.plot), and model diagnostic plots: goodness of fit (gof.plot), trace plots (trace...plot), and autocorrelation plots (act.Utot.plot).
These plots are all created using the $ggplot2$ package, so the user can modify them (e.g. change titles).
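For example, a returned plot can be modified with the usual $ggplot2$ grammar; the replacement title below is just an illustration:

```r
# The plots are ordinary ggplot objects, so layers can be added afterwards
demo.fit$plots$fit.plot +
  ggtitle("Junction City 2003 Chinook -- fitted spline") +
  xlab("Julian week")
```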
The $BTSPAS$ program also creates a report, which includes information about the data used in the fitting, the pooled- and stratified-Petersen estimates, a test for pooling, and summaries of the posterior. Only the first few lines are shown below:
```r
head(demo.fit$report)
```
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample:

```r
demo.fit$plots$fit.plot
```
The jump in the spline when hatchery fish are released is evident. The actual number of unmarked fish is allowed to vary around the spline as shown below.
The distribution of the posterior sample for the total number of unmarked fish and total abundance is available:

```r
demo.fit$plots$post.UNtot.plot
```
A plot of the $logit(P)$ is:

```r
demo.fit$plots$logitP.plot
```
In cases where there is no information, $BTSPAS$ has interpolated based on the distribution of catchability in the other strata, and so the credible interval is very wide (e.g. julian weeks 9, 13, and 41).
A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number of unmarked fish and total abundance:

```r
demo.fit$summary[ row.names(demo.fit$summary) %in% c("Ntot","Utot"), ]
```
This also includes the Brooks-Gelman-Rubin statistic ($Rhat$) on the mixing of the chains and the effective sample size of the posterior (after accounting for autocorrelation).
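As a quick scan for mixing problems, the summary matrix can be filtered on $Rhat$. This is a sketch, assuming the summary matrix carries the Rhat and n.eff columns of standard BUGS/JAGS output; the 1.05 cutoff is a common rule of thumb, not a $BTSPAS$ default:

```r
# Flag parameters whose Rhat suggests the chains have not mixed well
poor.mix <- demo.fit$summary[, "Rhat"] > 1.05
demo.fit$summary[poor.mix, c("mean", "sd", "Rhat", "n.eff"), drop = FALSE]
```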
The estimated total abundance from $BTSPAS$ is `r formatC(round(demo.fit$summary["Ntot","mean"]), big.mark=",", digits=0, format="f")` (SD `r formatC(round(demo.fit$summary["Ntot","sd"]), big.mark=",", digits=0, format="f")`) fish.
Samples from the posterior are also included in the sims.matrix, sims.array and sims.list elements of the results object.
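For example, a credible interval can be computed directly from the posterior sample; this sketch assumes the total abundance is stored under the name Ntot in sims.list, matching the summary row above:

```r
# Posterior median and 95% credible interval for total abundance
quantile(demo.fit$sims.list$Ntot, probs = c(0.025, 0.5, 0.975))
```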
It is always important to do model assessment before accepting the results from the model fit. Please contact me for details on how to interpret the goodness of fit, trace, and autocorrelation plots.
In some cases, the second trap is not running and so there are no recaptures of tagged fish and no captures of untagged fish.
We need to set the p's in these strata to 0 rather than letting BTSPAS impute a value.
Here is an example of some raw data that is read in:
```r
demo2.data.csv <- textConnection(
'jweek,   n1,  m2,    u2
 9,        0,   0,  4135
10,     1465,  51, 10452
11,     1106, 121,  2199
12,      229,  25,   655
13,       20,   0,   308
14,      177,  17,   719
15,      702,  74,   973
16,      633,  94,   972
17,     1370,  62,  2386
18,      283,  10,   469
19,      647,  32,   897
20,      276,  11,   426
21,      277,  13,   407
22,      333,  15,   526
23,     3981, 242, 39969
24,     3988,  55, 17580
25,     2889, 115,  7928
26,     3119, 198,  6918
27,     2478,  80,  3578
28,     1292,  71,  1713
29,     2326, 153,  4212
30,     2528, 156,  5037
31,     2338, 275,  3315
32,     1012, 101,  1300
33,      729,  66,   989
34,      333,  44,   444
35,      269,  33,   339
36,       77,   7,   107
37,       62,   0,     0
38,       26,   0,     0
39,       20,   0,     0
40,     4757, 188, 35118
41,     2876,   8, 34534
42,     3989,  81, 14960
43,     1755,  27,  3643
44,     1527,  30,  1811
45,      485,  14,   679
46,      115,   0,     0')

demo2.data <- read.csv(demo2.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
print(demo2.data)
```
```r
weeks.with.zero <- demo2.data$jweek[ demo2.data$m2==0 & demo2.data$u2==0 ]
weeks.with.zero.text <- paste(paste(weeks.with.zero[-length(weeks.with.zero)], ', ', sep="", collapse=""),
                              "and ", weeks.with.zero[length(weeks.with.zero)], sep="", collapse="")
```
Notice that there are no recaptures of marked fish and no captures of unmarked fish in julian weeks `r weeks.with.zero.text` when the trap was not operating. This differs from the case where a small number of tagged fish is released and none are recaptured, but unmarked fish are still captured, as in julian week 13.
Two additional arguments to the $BTSPAS$ fitting function allow you to specify the julian weeks in which the capture probability is fixed to a known (typically zero) value.
```r
demo2.logitP.fixed        <- c(37, 38, 39, 46)
demo2.logitP.fixed.values <- rep(-10, length(demo2.logitP.fixed))
```
The strata where the value of $p$ is to be fixed are specified along with the values (on the logit scale) at which the capture probabilities are fixed. Technically, $logit(0)$ is negative infinity, but $-10$ is ``close enough''.
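To see just how close, back-transform $-10$ to the probability scale:

```r
plogis(-10)   # a logit of -10 corresponds to p of about 0.000045 -- effectively zero
```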
The rest of the call is basically the same -- don't forget to specify the additional arguments in the call:

```r
library("BTSPAS")

# After which weeks is the spline allowed to jump?
demo2.jump.after <- c(22,39)  # julian weeks after which a jump occurs

# Which julian weeks have "bad" values? These will be set to missing and estimated.
demo2.bad.m2 <- c(41)  # julian weeks with bad m2 values (this is used in the Trinity Example)
demo2.bad.u2 <- c(11)  # julian weeks with bad u2 values (arbitrary, to demonstrate the feature)
demo2.bad.n1 <- c(38)  # julian weeks with bad n1 values (arbitrary, to demonstrate the feature)

# The prefix for the output files:
demo2.prefix <- "demo2-JC-2003-CH-TSPDE"

# Title for the analysis
demo2.title <- "Junction City 2003 Chinook with p fixed "

cat("*** Starting ", demo2.title, "\n\n")

# Make the call to fit the model and generate the output files
demo2.fit <- TimeStratPetersenDiagError_fit(
               title      = demo2.title,
               prefix     = demo2.prefix,
               time       = demo2.data$jweek,
               n1         = demo2.data$n1,
               m2         = demo2.data$m2,
               u2         = demo2.data$u2,
               jump.after = demo2.jump.after,
               logitP.fixed        = demo2.logitP.fixed,         # ***** NEW *****
               logitP.fixed.values = demo2.logitP.fixed.values,  # ***** NEW *****
               bad.n1     = demo2.bad.n1,
               bad.m2     = demo2.bad.m2,
               bad.u2     = demo2.bad.u2,
               InitialSeed= 890110,
               debug      = TRUE,   # generates only 10,000 iterations of the MCMC chain for checking
               save.output.to.files = FALSE)
```
```r
# delete extra files that were created
file.remove("data.txt")
file.remove("CODAindex.txt")
file.remove("CODAchain1.txt")
file.remove("CODAchain2.txt")
file.remove("CODAchain3.txt")
file.remove("inits1.txt")
file.remove("inits2.txt")
file.remove("inits3.txt")
file.remove("model.txt")
```
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample. Note how the spline interpolates across julian weeks `r weeks.with.zero.text` when no unmarked fish were captured, but with much larger uncertainty.

```r
demo2.fit$plots$fit.plot
```
The jump in the spline when hatchery fish are released is evident.
The distribution of the posterior sample for the total number of unmarked fish and total abundance is available as before:

```r
demo2.fit$plots$post.UNtot.plot
```
A plot of the $logit(P)$ is:

```r
demo2.fit$plots$logitP.plot
```
Notice the fixed values of $-10$ (on the logit scale) in the weeks when the trap was not operating.
A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number of unmarked fish and total abundance:

```r
demo2.fit$summary[ row.names(demo2.fit$summary) %in% c("Ntot","Utot"), ]
```
The estimated total abundance from $BTSPAS$ is `r formatC(round(demo2.fit$summary["Ntot","mean"]), big.mark=",", digits=0, format="f")` (SD `r formatC(round(demo2.fit$summary["Ntot","sd"]), big.mark=",", digits=0, format="f")`) fish.
BTSPAS also allows you to model the $p$'s with additional covariates, such as temperature, stream flow, etc. It is not possible to use covariates to model the total number of unmarked fish.
Here is an example of some raw data that includes the covariate $log(flow)$:
```r
demo3.data.csv <- textConnection(
'jweek,   n1,  m2,    u2,  logflow
 9,        0,   0,  4135, 6.617212
10,     1465,  51, 10452, 6.51217
11,     1106, 121,  2199, 7.193686
12,      229,  25,   655, 6.960754
13,       20,   0,   308, 7.008376
14,      177,  17,   719, 6.761573
15,      702,  74,   973, 6.905753
16,      633,  94,   972, 7.062314
17,     1370,  62,  2386, 7.600188
18,      283,  10,   469, 8.246509
19,      647,  32,   897, 8.110298
20,      276,  11,   426, 8.035001
21,      277,  13,   407, 7.859965
22,      333,  15,   526, 7.774255
23,     3981, 242, 39969, 7.709116
24,     3988,  55, 17580, 7.653766
25,     2889, 115,  7928, 7.622105
26,     3119, 198,  6918, 7.593734
27,     2478,  80,  3578, 7.585063
28,     1292,  71,  1713, 7.291072
29,     2326, 153,  4212, 6.55556
30,     2528, 156,  5037, 6.227665
31,     2338, 275,  3315, 6.278789
32,     1012, 101,  1300, 6.273685
33,      729,  66,   989, 6.241111
34,      333,  44,   444, 6.687999
35,      269,  33,   339, 7.222566
36,       77,   7,   107, 7.097194
37,       62,   9,    79, 6.949993
38,       26,   3,    41, 6.168714
39,       20,   1,    23, 6.113682
40,     4757, 188, 35118, 6.126557
41,     2876,   8, 34534, 6.167217
42,     3989,  81, 14960, 5.862413
43,     1755,  27,  3643, 5.696614
44,     1527,  30,  1811, 5.763847
45,      485,  14,   679, 5.987528
46,      115,   4,   154, 5.912344')

demo3.data <- read.csv(demo3.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
print(demo3.data)
```
A preliminary plot of the empirical logit (excluding those weeks when the trap was not running) shows an approximate quadratic fit to $log(flow)$, but the uncertainty in each week is enormous!
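The empirical logit used below adds the usual continuity corrections so that weeks with zero recaptures remain finite:

$$\mbox{elogit}(\hat{p}_j) = \log\left( \frac{(m2[j]+0.5)/(n1[j]+1)}{1 - (m2[j]+0.5)/(n1[j]+1)} \right)$$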
```r
demo3.data$elogitphat <- log( (demo3.data$m2+.5)/(demo3.data$n1+1) /
                             (1 - (demo3.data$m2+.5)/(demo3.data$n1+1)))

plotdata <- demo3.data[ demo3.data$u2 > 0, ]

ggplot(data=plotdata, aes(x=logflow, y=elogitphat))+
   ggtitle("Empirical logit vs. log(flow)")+
   geom_point()+
   geom_smooth(method="lm", se=FALSE, formula = y ~ x + I(x^2))
```
We need to create a matrix with the covariate values. We will need three columns: one for the intercept, one for log(flow), and one for its square. In practice, it is often advisable to standardize covariates to prevent numerical difficulties, but in this case the values are small enough that standardization is not really needed.
```r
demo3.logitP.cov <- cbind(1, demo3.data$logflow, demo3.data$logflow^2)
head(demo3.logitP.cov)
```
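If standardization were needed, a sketch might look like this (hypothetical; the unstandardized matrix above is what is actually passed to the fit below):

```r
# Centre and scale log(flow) before building the quadratic design matrix
zflow <- as.vector(scale(demo3.data$logflow))
demo3.logitP.cov.std <- cbind(1, zflow, zflow^2)
```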
The rest of the call is basically the same -- don't forget to specify the additional arguments in the call:

```r
library("BTSPAS")

# After which weeks is the spline allowed to jump?
demo3.jump.after <- c(22,39)  # julian weeks after which a jump occurs

# Which julian weeks have "bad" values? These will be set to missing and estimated.
demo3.bad.m2 <- c(41)  # julian weeks with bad m2 values (this is used in the Trinity Example)
demo3.bad.u2 <- c(11)  # julian weeks with bad u2 values (arbitrary, to demonstrate the feature)
demo3.bad.n1 <- c(38)  # julian weeks with bad n1 values (arbitrary, to demonstrate the feature)

# The prefix for the output files:
demo3.prefix <- "demo3-JC-2003-CH-TSPDE"

# Title for the analysis
demo3.title <- "Junction City 2003 Chinook with covariates for p "

cat("*** Starting ", demo3.title, "\n\n")

# Make the call to fit the model and generate the output files
demo3.fit <- TimeStratPetersenDiagError_fit(
               title      = demo3.title,
               prefix     = demo3.prefix,
               time       = demo3.data$jweek,
               n1         = demo3.data$n1,
               m2         = demo3.data$m2,
               u2         = demo3.data$u2,
               jump.after = demo3.jump.after,
               logitP.cov = demo3.logitP.cov,   # ***** NEW *****
               bad.n1     = demo3.bad.n1,
               bad.m2     = demo3.bad.m2,
               bad.u2     = demo3.bad.u2,
               InitialSeed= 890110,
               debug      = TRUE,   # generates only 10,000 iterations of the MCMC chain for checking
               save.output.to.files = FALSE)
```
```r
# delete extra files that were created
file.remove("data.txt")
file.remove("CODAindex.txt")
file.remove("CODAchain1.txt")
file.remove("CODAchain2.txt")
file.remove("CODAchain3.txt")
file.remove("inits1.txt")
file.remove("inits2.txt")
file.remove("inits3.txt")
file.remove("model.txt")
```
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample:

```r
demo3.fit$plots$fit.plot
```
The jump in the spline when hatchery fish are released is evident.
The distribution of the posterior sample for the total number of unmarked fish and total abundance is available as before:

```r
demo3.fit$plots$post.UNtot.plot
```
A plot of the $logit(P)$ is:

```r
demo3.fit$plots$logitP.plot
```
Here is a plot of the estimated $logit(p)$'s vs. the log(flow) and its fitted curve:
```r
demo3.row.names       <- rownames(demo3.fit$summary)
demo3.coeff.row.index <- grep("beta.logitP[", demo3.row.names, fixed=TRUE)
demo3.coeff.row.index <- demo3.coeff.row.index[1:3]  # keep only the three quadratic coefficients; the summary may contain an extra dummy beta entry
demo3.coeff           <- demo3.fit$summary[demo3.coeff.row.index, "mean"]
demo3.coeff.sd        <- demo3.fit$summary[demo3.coeff.row.index, "sd"]
demo3.pred.logitP     <- demo3.logitP.cov %*% demo3.coeff

demo3.logitP.row.index <- grep("^logitP", demo3.row.names)
demo3.logitP           <- demo3.fit$summary[demo3.logitP.row.index, "mean"]  # extract the logit(P) values

plotdata <- data.frame(jweek      = demo3.data$jweek,
                       logflow    = demo3.data$logflow,
                       pred       = demo3.pred.logitP,
                       actual     = demo3.logitP,
                       actual.lcl = demo3.fit$summary[demo3.logitP.row.index, "2.5%"],
                       actual.ucl = demo3.fit$summary[demo3.logitP.row.index, "97.5%"])

ggplot(data=plotdata, aes(x=logflow, y=pred))+
   ggtitle("Relationship of logit(p) and flow")+
   geom_line()+
   geom_errorbar(aes(ymin=actual.lcl, ymax=actual.ucl), color="blue", width=.01)+
   geom_point(aes(y=actual), color="blue")+
   ylab("logit(p) and 95% ci")+xlab("log(flow)")
```
There is virtually no evidence of a relationship with flow because of the very large uncertainties in each of the estimated $logit(p)$.
The estimated coefficients of the quadratic relationship between logit(p) and log(flow) are:
```r
round(demo3.fit$summary[demo3.coeff.row.index,], 3)
```
A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number of unmarked fish and total abundance:

```r
demo3.fit$summary[ row.names(demo3.fit$summary) %in% c("Ntot","Utot"), ]
```
The estimated total abundance from $BTSPAS$ is `r formatC(round(demo3.fit$summary["Ntot","mean"]), big.mark=",", digits=0, format="f")` (SD `r formatC(round(demo3.fit$summary["Ntot","sd"]), big.mark=",", digits=0, format="f")`) fish.
In some cases, you may have additional information about the effect of the covariates that you would like to incorporate into the analysis. For example, a rotary screw trap may have run for many years and plots of the relationship between the logit(catchability) vs. log(flow) generally shows a relationship that you may not be able to capture in a single year because of lack of contrast (i.e. the flow within a year does not vary enough) or because of smallish sample sizes.
BTSPAS allows you to specify prior information on the coefficients of the relationship between logit(catchability) and covariates.
We will show an example where four models are fit to the same data: no covariates; log(flow) with the default (very diffuse) prior; log(flow) with a weak prior; and log(flow) with a strong (conflicting) prior.
Here is an example of some (fictitious) raw data that includes the covariate $log(flow)$:
```r
demo5.data.csv <- textConnection(
'time,   n1,  m2,    u2, logFlow
 1,       0,   0,    21, 5.415
 2,      56,   8,  2266, 5.358
 3,    1009,  59, 11314, 6.737
 4,    1284,  25,  5035, 7.993
 5,    1504,  13,   396, 8.693
 6,       0,   0,    45, 8.861
 7,       0,   0,    26, 8.587
 8,    1560,  17,    12, 8.347
 9,    1643,  14,    43, 8.260
10,       0,   0,    63, 8.606
11,    1487,   7,    24, 8.671
12,       0,   0,     5, 8.737
13,       0,   0,     4, 7.862')

demo5.data <- read.csv(demo5.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
print(demo5.data)
```
A preliminary plot of the empirical logit (excluding those weeks when the trap was not running) shows an approximate linear fit to $log(flow)$, but the uncertainty in each week is enormous!
```r
demo5.data$elogitphat <- log( (demo5.data$m2+.5)/(demo5.data$n1+1) /
                             (1 - (demo5.data$m2+.5)/(demo5.data$n1+1)))

plotdata <- demo5.data[ demo5.data$u2 > 0, ]

ggplot(data=plotdata, aes(x=logFlow, y=elogitphat))+
   ggtitle("Empirical logit vs. log(flow)")+
   geom_point()+
   geom_smooth(method="lm", se=FALSE, formula = y ~ x)
```
We start by fitting BTSPAS with no covariates:

```r
fit1 <- BTSPAS::TimeStratPetersenDiagError_fit(
          title  = "no covariates",
          prefix = "fit1",
          time   = demo5.data$time,
          n1     = demo5.data$n1,
          m2     = demo5.data$m2,
          u2     = demo5.data$u2,
          InitialSeed = 234323,
          save.output.to.files = FALSE)
```
Estimated values of the overall mean logitP and the residual variation around the fitted line are:
```r
select <- grepl("beta.logitP", row.names(fit1$summary))
fit1$summary[select,][1]

select <- grepl("sigmaP", row.names(fit1$summary))  # sd(logitP) around the mean logitP, or around the regression curve if covariates are present
fit1$summary[select,]
```
The plot of the estimated logit(catchability) vs. log(flow) and the overall mean is:
```r
plotdata1 <- data.frame(logFlow = demo5.data$logFlow,
                        logitP  = fit1$mean$logitP,
                        time    = demo5.data$time,
                        n1      = demo5.data$n1)
plotdata1$fit <- 'fit1 - no covariates'

ggplot(data=plotdata1, aes(x=logFlow, y=logitP))+
   ggtitle("no covariates")+
   geom_point(aes(size=n1))+
   geom_text(aes(label=time), vjust=-.5)+
   geom_hline(yintercept=fit1$mean$beta.logitP[1], color="red")
```
There appears to be a relationship with log(flow) that has not been captured with this model.
We need to create a matrix with the covariate values. We will need two columns: one for the intercept and one for log(flow). In practice, it is often advisable to standardize covariates to prevent numerical difficulties, but in this case the values are small enough that standardization is not really needed.
```r
fit2 <- BTSPAS::TimeStratPetersenDiagError_fit(
          title  = "log(Flow) - default prior",
          prefix = "fit2",
          time   = demo5.data$time,
          n1     = demo5.data$n1,
          m2     = demo5.data$m2,
          u2     = demo5.data$u2,
          logitP.cov = cbind(1, demo5.data$logFlow),
          InitialSeed = 3542343,
          save.output.to.files = FALSE)
```
Estimated values of the beta coefficients (the intercept and slope) and the residual variation around the fitted line are:
```r
select <- grepl("beta.logitP", row.names(fit2$summary))
fit2$summary[select,][1:2]

select <- grepl("sigmaP", row.names(fit2$summary))  # sd(logitP) around the mean logitP, or around the regression curve if covariates are present
fit2$summary[select,]
```
The plot of the estimated logit(catchability) vs. log(flow), with the fitted line, is:
```r
plotdata2 <- data.frame(logFlow = demo5.data$logFlow,
                        logitP  = fit2$mean$logitP,
                        time    = demo5.data$time,
                        n1      = demo5.data$n1)
plotdata2$fit <- 'fit2 - log flow default priors'

ggplot(data=plotdata2, aes(x=logFlow, y=logitP))+
   ggtitle("log(flow) - default priors")+
   geom_point(aes(size=n1))+
   geom_text(aes(label=time), vjust=-.5)+
   geom_abline(intercept=fit2$mean$beta.logitP[1],
               slope    =fit2$mean$beta.logitP[2], color="green")
```
We now have a relationship with log(flow), but as you will see later, the evidence is not very strong.
Prior information on the beta coefficients (the intercept and slope) is given using the prior.beta.logitP.mean and prior.beta.logitP.sd parameters in the call. The first specifies the prior means of the intercept and slope, and the second specifies the uncertainty in these prior values.
Consider the fit:
```r
fit3 <- BTSPAS::TimeStratPetersenDiagError_fit(
          title  = "log(Flow) - weak priors",
          prefix = "fit3",
          time   = demo5.data$time,
          n1     = demo5.data$n1,
          m2     = demo5.data$m2,
          u2     = demo5.data$u2,
          logitP.cov = cbind(1, demo5.data$logFlow),
          prior.beta.logitP.mean = c(.3, -.7),  # prior means for intercept and slope on the logit scale
          prior.beta.logitP.sd   = c( 2,   2),  # prior sd for intercept and slope on the logit scale
          InitialSeed = 3542343,
          save.output.to.files = FALSE)
```
Here the prior for the intercept is set to 0.3 and the prior for the slope is set to -.7 (the prior.beta.logitP.mean).
The prior.beta.logitP.sd gives the uncertainty (like a standard error) in these values. In this fit, I used a large uncertainty, a standard deviation of 2 for both. This says the prior puts information on $.3 \pm 2 \times 2$ for the intercept and $-.7 \pm 2 \times 2$ for the slope, i.e. about 95% of the prior mass lies within two standard deviations of the mean.
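As a rough check on what such a prior implies for the intercept, the $\pm 2$ standard-deviation range can be back-transformed to the probability scale (at the origin of the covariate, i.e. $log(flow)=0$):

```r
# Approximate 95% prior range for the intercept, on the probability scale
plogis(0.3 + c(-2, 2) * 2)   # roughly 0.02 to 0.99 -- a very weak prior
```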
Estimated values of the beta coefficients (the intercept and slope) and the residual variation around the fitted line are:
```r
select <- grepl("beta.logitP", row.names(fit3$summary))
fit3$summary[select,][1:2]

select <- grepl("sigmaP", row.names(fit3$summary))  # sd(logitP) around the mean logitP, or around the regression curve if covariates are present
fit3$summary[select,]
```
The plot of the estimated logit(catchability) vs. log(flow), with the fitted line from the weak-prior model, is:
```r
plotdata3 <- data.frame(logFlow = demo5.data$logFlow,
                        logitP  = fit3$mean$logitP,
                        time    = demo5.data$time,
                        n1      = demo5.data$n1)
plotdata3$fit <- 'fit3 - log(flow) - weak priors'

ggplot(data=plotdata3, aes(x=logFlow, y=logitP))+
   ggtitle("log(flow) - weak priors")+
   geom_point(aes(size=n1))+
   geom_text(aes(label=time), vjust=-.5)+
   geom_abline(intercept=fit3$mean$beta.logitP[1],
               slope    =fit3$mean$beta.logitP[2], color="brown", linetype="solid")
```
The weak information on the slope gives a boost to the relationship between log(flow) and the logit(catchability), especially for those weeks when the sample size is very small.
We repeat the fit but with very strong prior information:
```r
fit4 <- BTSPAS::TimeStratPetersenDiagError_fit(
          title  = "log(Flow) - strong priors",
          prefix = "fit4",
          time   = demo5.data$time,
          n1     = demo5.data$n1,
          m2     = demo5.data$m2,
          u2     = demo5.data$u2,
          logitP.cov = cbind(1, demo5.data$logFlow),
          prior.beta.logitP.mean = c( .3, -.7),  # prior means for intercept and slope on the logit scale
          prior.beta.logitP.sd   = c(.01, .01),  # prior sd for intercept and slope on the logit scale
          InitialSeed = 3542343,
          save.output.to.files = FALSE)
```
Here the prior for the intercept is set to 0.3 and the prior for the slope is set to -.7 (the prior.beta.logitP.mean).
The prior.beta.logitP.sd gives the uncertainty (like a standard error) in these values. In this fit, I used a very small uncertainty, a standard deviation of .01 for both. This says the prior puts information on $.3 \pm 2 \times .01$ for the intercept and $-.7 \pm 2 \times .01$ for the slope, i.e. about 95% of the prior mass lies within two standard deviations of the mean.
Estimated values of the beta coefficients (the intercept and slope) and the residual variation around the fitted line are:
```r
select <- grepl("beta.logitP", row.names(fit4$summary))
fit4$summary[select,][1:2]

select <- grepl("sigmaP", row.names(fit4$summary))  # sd(logitP) around the mean logitP, or around the regression curve if covariates are present
fit4$summary[select,]
```
The plot of the estimated logit(catchability) vs. log(flow), with the fitted line from the strong-prior model, is:
```r
plotdata4 <- data.frame(logFlow = demo5.data$logFlow,
                        logitP  = fit4$mean$logitP,
                        time    = demo5.data$time,
                        n1      = demo5.data$n1)
plotdata4$fit <- 'fit4 - log(flow) - strong priors'

ggplot(data=plotdata4, aes(x=logFlow, y=logitP))+
   ggtitle("log(flow) - strong prior")+
   geom_point(aes(size=n1))+
   geom_text(aes(label=time), vjust=-.5)+
   geom_abline(intercept=fit4$mean$beta.logitP[1],
               slope    =fit4$mean$beta.logitP[2], color="brown", linetype="dashed")
```
Notice that the (strong) prior is now in conflict with the data. The model now believes that the variation around the line must be large, which allows it to move estimates of logit(catchability) below the line.
```r
select <- grepl("Ntot", row.names(fit1$summary))
t1 <- data.frame(as.list(fit1$summary[select,]))
t1$Source <- "No covariates"

select <- grepl("Ntot", row.names(fit2$summary))
t2 <- data.frame(as.list(fit2$summary[select,]))
t2$Source <- "Default prior"

select <- grepl("Ntot", row.names(fit3$summary))
t3 <- data.frame(as.list(fit3$summary[select,]))
t3$Source <- "Weak prior"

select <- grepl("Ntot", row.names(fit4$summary))
t4 <- data.frame(as.list(fit4$summary[select,]))
t4$Source <- "Strong prior"

temp <- rbind(t1, t2, t3, t4)
kable(temp[,c("Source","mean","sd")], row.names=FALSE,
      caption="Comparing estimates of abundance among fits",
      col.names=c("Source","Mean","SD"),
      digits=c(0,0,0)) %>%
   column_spec(column=c(1),   width="3cm") %>%
   column_spec(column=c(2,3), width="1.5cm") %>%
   kable_styling("bordered", position="center", full_width=FALSE,
                 latex_options="HOLD_position")
```
There appears to be little impact on the estimate of abundance, but notice that the uncertainty declines as you add information from fit1 to fit3; when you add a strong conflicting prior (see below), the uncertainty increases again.
```r
select <- grepl("beta.logitP", row.names(fit1$summary))
t1 <- data.frame(as.list(fit1$summary[select,][1,]))
t1$Source <- "No covariates"
t1$Est    <- "Intercept"

select <- grepl("beta.logitP", row.names(fit2$summary))
t2 <- data.frame(fit2$summary[select,][1:2,])
t2$Source <- "Default prior"
t2$Est    <- c("Intercept","Slope")

select <- grepl("beta.logitP", row.names(fit3$summary))
t3 <- data.frame(fit3$summary[select,][1:2,])
t3$Source <- "Weak prior"
t3$Est    <- c("Intercept","Slope")

select <- grepl("beta.logitP", row.names(fit4$summary))
t4 <- data.frame(fit4$summary[select,][1:2,])
t4$Source <- "Strong prior"
t4$Est    <- c("Intercept","Slope")

temp <- rbind(t1, t2, t3, t4)
kable(temp[,c("Source","Est","mean","sd")], row.names=FALSE,
      caption="Comparing estimates of beta coefficients among fits",
      col.names=c("Source","Estimate","Mean","SD"),
      digits=c(0,0,3,3)) %>%
   column_spec(column=c(1:2), width="3cm") %>%
   column_spec(column=c(3,4), width="1.5cm") %>%
   kable_styling("bordered", position="center", full_width=FALSE,
                 latex_options="HOLD_position")
```
With the strong prior, the data plays essentially no role in determining the slope and intercept. With a weak prior, the estimated slope has lower precision than with the even weaker (default) prior.
```r
select <- grepl("sigmaP", row.names(fit1$summary))
t1 <- data.frame(as.list(fit1$summary[select,]))
t1$Source <- "No covariates"

select <- grepl("sigmaP", row.names(fit2$summary))
t2 <- data.frame(as.list(fit2$summary[select,]))
t2$Source <- "Default prior"

select <- grepl("sigmaP", row.names(fit3$summary))
t3 <- data.frame(as.list(fit3$summary[select,]))
t3$Source <- "Weak prior"

select <- grepl("sigmaP", row.names(fit4$summary))
t4 <- data.frame(as.list(fit4$summary[select,]))
t4$Source <- "Strong prior"

temp <- rbind(t1, t2, t3, t4)
kable(temp[,c("Source","mean","sd")], row.names=FALSE,
      caption="Comparing estimates of residual variation in logitP among fits",
      col.names=c("Source","Mean","SD"),
      digits=c(0,3,3)) %>%
   column_spec(column=c(1),   width="3cm") %>%
   column_spec(column=c(2,3), width="1.5cm") %>%
   kable_styling("bordered", position="center", full_width=FALSE,
                 latex_options="HOLD_position")
```
The residual variation declines as more information is added via the prior for the first 3 fits, but then increases when a strong (conflicting) prior is added (last fit).
```r
# fitted logit(p) vs covariate for each model
plotdata <- plyr::rbind.fill(plotdata1, plotdata2, plotdata3, plotdata4)

ggplot(data=plotdata, aes(x=logFlow, y=logitP))+
   ggtitle("Fitted logitP vs log(flow) - each model")+
   geom_point(aes(size=n1))+
   geom_text(aes(label=time), vjust=-.5)+
   geom_hline(yintercept=fit1$mean$beta.logitP[1], color="red")+
   geom_abline(intercept=fit2$mean$beta.logitP[1],
               slope    =fit2$mean$beta.logitP[2], color="green")+
   geom_abline(intercept=fit3$mean$beta.logitP[1],
               slope    =fit3$mean$beta.logitP[2], color="brown", linetype="solid")+
   geom_abline(intercept=fit4$mean$beta.logitP[1],
               slope    =fit4$mean$beta.logitP[2], color="brown", linetype="dashed")+
   facet_wrap(~fit, ncol=2, dir="v")
```
In the plots of logitP vs. log(flow), the red horizontal line is from fit1 (no covariates) and represents the mean. The green line is from the fit of logitP vs logFlow using the default prior. The solid brown line is from the fit of logitP vs logFlow for the weak prior and the dashed brown line is for the strong prior. The size of the dots represents n1 the number of fish released.
So for fit1, there is very little data for times 1 and 2, and so their logitP values are basically free to vary. For fit2, the default prior gives some (but not much) information about the relationship between logitP and logFlow, so the estimates of logitP for periods 1 and 2 are pulled closer to the green line. For fit3, the weak prior gives more information, so the points move very close to the line; as well, the periods with lots of data are well fit with very little noise, so the model infers that the among-period variation in logitP must also be small, and all the points are forced to lie near the line. For fit4, the very strong prior is placed on a WRONG line (the dashed brown). Now the model is in a quandary: the only way the logitP values for weeks 3, 4, etc. (with lots of data) can be consistent with the dashed brown line is if the among-period variance is very large, and so the logitP values for weeks with very poor data are allowed to move away from the dashed brown line.
We can also compare the spline fit to the number of unmarked fish:
```r
# Look at the spline fits for each model
allplots <- gridExtra::arrangeGrob(
              fit1$plots$fit.plot,
              fit2$plots$fit.plot,
              fit3$plots$fit.plot,
              fit4$plots$fit.plot,
              ncol=2, as.table=FALSE)
plot(allplots)
```
The spline is only affected slightly.
We can also compare the trend of logitP over time:
```r
# Look at the logitP-over-time fits for each model
allplots <- gridExtra::arrangeGrob(
              fit1$plots$logitP.plot,
              fit2$plots$logitP.plot,
              fit3$plots$logitP.plot,
              fit4$plots$logitP.plot,
              ncol=2, as.table=FALSE)
plot(allplots)
```
The trend over time is mostly unchanged for periods with lots of data; for periods with very small amounts of data, the points are allowed to vary as needed.
```r
# delete extra files that were created
file.remove("data.txt")
file.remove("CODAindex.txt")
file.remove("CODAchain1.txt")
file.remove("CODAchain2.txt")
file.remove("CODAchain3.txt")
file.remove("inits1.txt")
file.remove("inits2.txt")
file.remove("inits3.txt")
file.remove("model.txt")
```
The previous example still showed some temporal structure in the $p$'s. This additional structure can be imposed by using a spline for the $p$'s.
Here is an example of some raw data:
```r
demo4.data.csv <- textConnection(
'jweek,   n1,  m2,    u2
 9,        0,   0,  4135
10,     1465,  51, 10452
11,     1106, 121,  2199
12,      229,  25,   655
13,       20,   0,   308
14,      177,  17,   719
15,      702,  74,   973
16,      633,  94,   972
17,     1370,  62,  2386
18,      283,  10,   469
19,      647,  32,   897
20,      276,  11,   426
21,      277,  13,   407
22,      333,  15,   526
23,     3981, 242, 39969
24,     3988,  55, 17580
25,     2889, 115,  7928
26,     3119, 198,  6918
27,     2478,  80,  3578
28,     1292,  71,  1713
29,     2326, 153,  4212
30,     2528, 156,  5037
31,     2338, 275,  3315
32,     1012, 101,  1300
33,      729,  66,   989
34,      333,  44,   444
35,      269,  33,   339
36,       77,   7,   107
37,       62,   9,    79
38,       26,   3,    41
39,       20,   1,    23
40,     4757, 188, 35118
41,     2876,   8, 34534
42,     3989,  81, 14960
43,     1755,  27,  3643
44,     1527,  30,  1811
45,      485,  14,   679
46,      115,   4,   154')

demo4.data <- read.csv(demo4.data.csv, header=TRUE, as.is=TRUE, strip.white=TRUE)
print(demo4.data)
```
A preliminary plot of the empirical logit (excluding those weeks when the trap was not running) shows some temporal structure:
```r
demo4.data$elogitphat <- log( (demo4.data$m2+.5)/(demo4.data$n1+1) /
                             (1 - (demo4.data$m2+.5)/(demo4.data$n1+1)))

plotdata <- demo4.data[ demo4.data$u2 > 0, ]

ggplot(data=plotdata, aes(x=jweek, y=elogitphat))+
   ggtitle("Empirical logit vs. julian week")+
   geom_point()+
   geom_smooth(se=FALSE)+
   ylab("Empirical logit(p)")+xlab("Julian week")
```
We can create a set of covariates that serve as the basis for the spline over time using the $bs()$ function from the splines package:
```r
library(splines)
demo4.logitP.cov <- bs(1:length(demo4.data$n1),
                       df=floor(length(demo4.data$n1)/4),
                       intercept=TRUE)
head(demo4.logitP.cov)
```
The rest of the call is basically the same -- don't forget to specify the additional arguments in the call:

```r
library("BTSPAS")

# After which weeks is the spline allowed to jump?
demo4.jump.after <- c(22,39)  # julian weeks after which a jump occurs

# Which julian weeks have "bad" values? These will be set to 0 or missing prior to the run.
demo4.bad.m2 <- c(41)  # julian weeks with bad m2 values (this is used in the Trinity Example)
demo4.bad.u2 <- c(11)  # julian weeks with bad u2 values (arbitrary, to demonstrate the feature)
demo4.bad.n1 <- c(38)  # julian weeks with bad n1 values (arbitrary, to demonstrate the feature)

# The prefix for the output files:
demo4.prefix <- "demo4-JC-2003-CH-TSPDE"

# Title for the analysis
demo4.title <- "Junction City 2003 Chinook with spline for p "

cat("*** Starting ", demo4.title, "\n\n")

# Make the call to fit the model and generate the output files
demo4.fit <- TimeStratPetersenDiagError_fit(
               title      = demo4.title,
               prefix     = demo4.prefix,
               time       = demo4.data$jweek,
               n1         = demo4.data$n1,
               m2         = demo4.data$m2,
               u2         = demo4.data$u2,
               jump.after = demo4.jump.after,
               logitP.cov = demo4.logitP.cov,   # ***** NEW *****
               bad.n1     = demo4.bad.n1,
               bad.m2     = demo4.bad.m2,
               bad.u2     = demo4.bad.u2,
               InitialSeed= 890110,
               debug      = TRUE,   # generates only 10,000 iterations of the MCMC chain for checking
               save.output.to.files = FALSE)
```
```r
# delete extra files that were created
file.remove("data.txt")
file.remove("CODAindex.txt")
file.remove("CODAchain1.txt")
file.remove("CODAchain2.txt")
file.remove("CODAchain3.txt")
file.remove("inits1.txt")
file.remove("inits2.txt")
file.remove("inits3.txt")
file.remove("model.txt")
```
Here is the fitted spline curve to the number of unmarked fish available in each recovery sample:

```r
demo4.fit$plots$fit.plot
```
The jump in the spline when hatchery fish are released is evident.
The distribution of the posterior sample for the total number of unmarked fish and total abundance is available as before:

```r
demo4.fit$plots$post.UNtot.plot
```
A plot of the $logit(P)$ is:

```r
demo4.fit$plots$logitP.plot
```
Here is a plot of the estimated $logit(p)$'s with the fitted spline for the $p$'s:
```r
demo4.row.names       <- rownames(demo4.fit$summary)
demo4.coeff.row.index <- grep("beta.logitP[", demo4.row.names, fixed=TRUE)
demo4.coeff.row.index <- demo4.coeff.row.index[1:ncol(demo4.logitP.cov)]
demo4.coeff           <- demo4.fit$summary[demo4.coeff.row.index, "mean"]
demo4.coeff.sd        <- demo4.fit$summary[demo4.coeff.row.index, "sd"]
demo4.pred.logitP     <- demo4.logitP.cov %*% demo4.coeff

demo4.logitP.row.index <- grep("^logitP", demo4.row.names)
demo4.logitP           <- demo4.fit$summary[demo4.logitP.row.index, "mean"]  # extract the logit(P) values

plotdata <- data.frame(jweek      = demo4.data$jweek,
                       pred       = demo4.pred.logitP,
                       actual     = demo4.logitP,
                       actual.lcl = demo4.fit$summary[demo4.logitP.row.index, "2.5%"],
                       actual.ucl = demo4.fit$summary[demo4.logitP.row.index, "97.5%"])

ggplot(data=plotdata, aes(x=jweek, y=pred))+
   ggtitle("Relationship of logit(p) and time with fitted spline")+
   geom_line()+
   geom_errorbar(aes(ymin=actual.lcl, ymax=actual.ucl), color="blue", width=.01)+
   geom_point(aes(y=actual), color="blue")+
   ylab("logit(p) and 95% ci")+xlab("Julian week")
```
The underlying spline smooths the p's somewhat, especially when the credible intervals are very wide (e.g. around julian weeks 37-40).
A summary of the posterior for each parameter is also available. In particular, here are the summary statistics on the posterior sample for the total number of unmarked fish and total abundance:

```r
demo4.fit$summary[ row.names(demo4.fit$summary) %in% c("Ntot","Utot"), ]
```
The estimated total abundance from $BTSPAS$ is `r formatC(round(demo4.fit$summary["Ntot","mean"]), big.mark=",", digits=0, format="f")` (SD `r formatC(round(demo4.fit$summary["Ntot","sd"]), big.mark=",", digits=0, format="f")`) fish.
Bonner, S. J., & Schwarz, C. J. (2011). Smoothing population size estimates for time-stratified mark-recapture experiments using Bayesian p-splines. Biometrics, 67, 1498–1507. https://doi.org/10.1111/j.1541-0420.2011.01599.x

Schwarz, C. J., & Dempson, J. B. (1994). Mark-recapture estimation of a salmon smolt population. Biometrics, 50, 98–108.