Some introductory comments about the vignette
```{r}
library(densgrad)
```
Data come from three sources:
```{r}
real <- read.csv(file="../data/GPSpoints.txt", header=TRUE, sep="\t")
```

Make some adjustments to field names:

```{r}
names(real) <- c("infid","nearfid","dist","reserve","veg","animal")
```
Actual distance sampling survey data
```{r}
survey <- read.csv(file="../data/nonstratified_farrar.txt", header=TRUE, sep="\t")
```
Transect 23 had no kangaroo detections, hence in the data the perpendicular distance is recorded as NA. That record provides no information about the detection function and needs to be removed for purposes of parameter estimation (note that the effort associated with transects without detections does need to be retained). Similarly, the truncation distance is set here and the survey data edited accordingly.
```{r}
survey.truncation <- 150
gradient.max.distance <- 100
survey.data <- survey$Distance[!is.na(survey$Distance) &
                               survey$Distance < survey.truncation]
```
This optimisation of the density-gradient likelihood on its own is not strictly necessary, because the joint likelihood fitted below re-estimates the gradient parameters; however, its estimates are used to plot the fitted gradient.
```{r}
gradient <- optim(par=c(35, 1.2), fn=likdensity,
                  distances=real$dist, truncation=gradient.max.distance,
                  control=list(fnscale=-1))
if (gradient$convergence == 0) {
  sigma.est <- gradient$par[1]
  beta.est <- gradient$par[2]
  likelihood <- gradient$value
} else {
  print("Convergence not achieved, no parameter estimates produced")
}
```
Let's see a plot of the fitted animal distribution with respect to transects (Fig. \@ref(fig:gradfig)).
```{r gradfig}
xs <- seq(from=0, to=gradient.max.distance)
par(mfrow=c(1,1), mar=c(4,4,0.5,0.5))
hist(real$dist, prob=TRUE, main="", xlab="Distance (m)", nc=20)
lines(xs, pi.x(xs, sigma.est, beta.est, gradient.max.distance),
      type="l", lty=2, lwd=2)
```
First the half normal detection function
```{r}
hn.det <- optim(par=c(50, 35, 35), fn=full.lik.fx.HN,
                detectdists=survey.data, gpsdists=real$dist,
                truncdet=survey.truncation, truncden=gradient.max.distance,
                control=list(fnscale=-1))
if (hn.det$convergence == 0) {
  sigma.detect.est <- hn.det$par[1]
  sigma.gradient.est <- hn.det$par[2]
  beta.est <- hn.det$par[3]
  likelihood.hn <- hn.det$value
  aic.hn <- -2 * likelihood.hn + 2 * length(hn.det$par)
} else {
  print("Convergence not achieved, no parameter estimates produced")
}
```
Next the hazard rate detection function
```{r}
hz.det <- optim(par=c(50, 1.2, 35, 35), fn=full.lik.fx.HR,
                detectdists=survey.data, gpsdists=real$dist,
                truncdet=survey.truncation, truncden=gradient.max.distance,
                control=list(fnscale=-1))
if (hz.det$convergence == 0) {
  sigma.detect.est <- hz.det$par[1]
  beta.detect.est <- hz.det$par[2]
  sigma.gradient.est <- hz.det$par[3]
  beta.gradient.est <- hz.det$par[4]
  likelihood.hz <- hz.det$value
  aic.hz <- -2 * likelihood.hz + 2 * length(hz.det$par)
} else {
  print("Convergence not achieved, no parameter estimates produced")
}
```
The AIC for the half normal detection model is `r format(aic.hn, scientific=FALSE, nsmall=0)`, while the corresponding value for the hazard rate detection model is `r format(aic.hz, scientific=FALSE, nsmall=0)`.
Estimated parameters from both models:
```{r}
hn.estimates <- t(c(aic.hn, hn.det$par[1], NA, hn.det$par[2:3]))
hz.estimates <- t(c(aic.hz, hz.det$par))
param.table <- as.data.frame(rbind(hn.estimates, hz.estimates))
rownames(param.table) <- c("Half normal", "Hazard rate")
colnames(param.table) <- c("AIC", "$\\widehat{\\sigma_D}$", "$\\widehat{\\beta_D}$",
                           "$\\widehat{\\sigma_G}$", "$\\widehat{\\beta_G}$")
knitr::kable(param.table, digits=1,
             caption="Parameter estimates for two joint models.")
```
Given the (marginally) smaller AIC for the half normal detection function, Fig. \@ref(fig:HN) shows the fit of the half normal detection function to the observed detection distances.
```{r HN}
hist(survey$Distance[survey$Distance < survey.truncation],
     main="", yaxt="n", ylab="g(y)", xlab="Distance (m)")
xs <- seq(from=0, to=survey.truncation)
lines(xs, 60*detfn(z=xs, pars=hn.det$par[1], key="HN", adjn=0, w=survey.truncation),
      lwd=2, type="l", lty=3)
axis(2, at=seq(0, 60, length=5), labels=seq(0, 1, length=5))
```
Integration of the selected detection function over all perpendicular distances gives an estimate of detection probability ($\hat{P}$) used to adjust number of detections to account for imperfect detectability.
```{r}
Phat <- estimate.Phat("HN", hn.det$par, survey.truncation)
```
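As a sketch of the underlying idea only (not the package's `estimate.Phat` implementation), average detectability is the detection function integrated over perpendicular distance and divided by the truncation distance. The half normal form $g(y) = \exp(-y^2/2\sigma^2)$ and the `sigma` value below are illustrative assumptions, not estimates from this survey:

```r
# Sketch: average detection probability under an assumed half normal
# detection function g(y) = exp(-y^2 / (2*sigma^2)).
g.halfnorm <- function(y, sigma) exp(-y^2 / (2 * sigma^2))
sigma <- 50   # hypothetical scale parameter (m)
w <- 150      # truncation distance (m), as used in this vignette
# P = (1/w) * integral of g(y) over [0, w]
P <- integrate(g.halfnorm, lower = 0, upper = w, sigma = sigma)$value / w
```

Any valid detection function (e.g. the hazard rate) could be substituted for `g.halfnorm` in the same calculation.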
To estimate density in the covered region, the classic formula is applied,
$$ \hat{D} = \frac{n \cdot E[s]}{2wL\hat{P}} $$
using the probability of detection derived from the best-fitting detection function that incorporates a density gradient.
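As a purely numerical illustration of this estimator (all input values below are invented for the example, not taken from the kangaroo survey), the formula can be wrapped in a small function:

```r
# Illustration only: classic line-transect density estimator
# D-hat = n * E[s] / (2 * w * L * P-hat), with made-up inputs.
density.estimate <- function(n, Es, w, L, Phat) {
  (n * Es) / (2 * w * L * Phat)
}
# hypothetical values: 100 detections, mean group size 2,
# w = 150 m, L = 20000 m of transect, P-hat = 0.4
D.m2 <- density.estimate(n = 100, Es = 2, w = 150, L = 20000, Phat = 0.4)
D.ha <- D.m2 * 10000   # convert animals per m^2 to animals per ha
```

Note that the numerator scales the count by group size, while the denominator is the effectively searched area $2wL\hat{P}$.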
In addition, three statistics from the survey are required: the number of detections (n), the total line length surveyed (L), and the expected group size (E[s]); the truncation distance (w) was specified earlier by the user. Note that the argument to the function below is the survey data frame, including transects on which no detections were made. This is necessary so that effort (L) is calculated correctly.
```{r}
survey.design <- survey.design.summary(survey, survey.truncation)
n <- survey.design$n
L <- survey.design$L
Es <- survey.design$Es
```
Perpendicular distances as well as survey effort were recorded in meters. However, the density estimate in the original paper was reported as numbers ha^-1^. The conversion factor of 10000 converts density from m^-2^ to ha^-1^. Estimated density is then
```{r}
Dhat <- (n * Es) / (2 * survey.truncation * L * Phat) * 10000
```
A size-bias-adjusted estimate of expected group size, and the corresponding adjusted density estimate, are also computed:

```{r}
Esadj <- size.bias("HN", survey.data, survey, real$dist,
                   survey.truncation, gradient.max.distance)
Dadj <- (n * Esadj) / (2 * survey.truncation * L * Phat) * 10000
```
```{r}
out.table <- data.frame(n=n, Es=Es, Esadj=Esadj, w=survey.truncation,
                        L=L, P=Phat, D=Dhat, Dadj=Dadj)
colnames(out.table) <- c("n", "E(s)", "$E(s)_{sb}$", "w", "L",
                         "$\\hat{P}$", "$\\hat{D}$", "$\\hat{D_{sb}}$")
knitr::kable(out.table, digits=3,
             caption="Summary statistics and estimate of density from selected model.")
```
A bootstrap approach is used for variance estimation and a histogram of the estimated densities is shown in Fig. \@ref(fig:bootD).
```{r bootD}
bounds <- varbootstrap("HN", survey, real, survey.truncation,
                       gradient.max.distance, Nboot=999, plotit=TRUE, alpha=0.05)
knitr::kable(data.frame(LCI=signif(unname(bounds$CIbounds[1]), 4),
                        UCI=signif(unname(bounds$CIbounds[2]), 4)))
```
Because details of each simulation replicate are saved and returned as the first object in the named list, analysts can examine the sampling distribution of all the parameters estimated by the joint likelihood: $\widehat{\sigma_D}$, $\widehat{\beta_D}$ (if HR), $\widehat{\sigma_G}$, $\widehat{\beta_G}$, as well as the derived parameters $\hat{P}$ and $\hat{D}$ (Fig. \@ref(fig:others)).
```{r others}
alpha <- 0.05
alphaover2 <- alpha/2
CIlimits <- c(alphaover2, 1 - alphaover2)
par(mfrow=c(2,2))
hist(bounds$par.estimates$sigma1, main=expression(widehat(sigma[D])),
     xlab="Estimated value")
abline(v=quantile(bounds$par.estimates$sigma1, probs=CIlimits), lwd=2, lty=3)
hist(bounds$par.estimates$sigma2, main=expression(widehat(sigma[G])),
     xlab="Estimated value")
abline(v=quantile(bounds$par.estimates$sigma2, probs=CIlimits), lwd=2, lty=3)
hist(bounds$par.estimates$beta, main=expression(widehat(beta[G])),
     xlab="Estimated value")
abline(v=quantile(bounds$par.estimates$beta, probs=CIlimits), lwd=2, lty=3)
hist(bounds$par.estimates$P, main=expression(widehat(P)),
     xlab="Estimated value")
abline(v=quantile(bounds$par.estimates$P, probs=CIlimits), lwd=2, lty=3)
```
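The percentile-interval idea used for these plots can be sketched on its own. Here the "bootstrap replicates" are simply draws from a normal distribution, purely for illustration (they are not output from `varbootstrap`):

```r
# Sketch: percentile confidence limits from a vector of bootstrap
# replicates, here simulated rather than taken from the survey.
set.seed(123)
boot.reps <- rnorm(999, mean = 1.5, sd = 0.2)  # stand-in for bootstrapped D-hat values
alpha <- 0.05
ci <- quantile(boot.reps, probs = c(alpha/2, 1 - alpha/2))
```

The same `quantile()` call applied to any column of `bounds$par.estimates` reproduces the dotted limits drawn in the histograms above.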