View source: R/do.bootstrap.cress.R
do.bootstrap.cress | R Documentation |
This function performs a specified number of bootstrap iterations, using CReSS/SALSA to fit the second-stage count model. See below for details.
do.bootstrap.cress(
orig.data,
predict.data,
ddf.obj = NULL,
model.obj,
splineParams,
g2k,
resample = "transect.id",
rename = "segment.id",
stratum = NULL,
B,
name = NULL,
save.data = FALSE,
nhats = FALSE,
seed = 12345,
nCores = 1
)
orig.data |
The original data. In case |
predict.data |
The prediction grid data |
ddf.obj |
The ddf object created for the best fitting detection model. Defaults to |
model.obj |
The best fitting |
splineParams |
The object describing the parameters for fitting the one and two dimensional splines |
g2k |
(N x k) matrix of distances between all prediction points (N) and all knot points (k) |
resample |
Specifies the resampling unit for bootstrapping, default is |
rename |
A vector of column names for which a new column needs to be created for the bootstrapped data. This defaults to |
stratum |
The column name in |
B |
Number of bootstrap iterations |
name |
Analysis name. Required to avoid overwriting previous bootstrap results. This name is added at the beginning of "predictionboot.RData" when saving bootstrap predictions. |
save.data |
If TRUE, all created bootstrap data will be saved as an RData object in the working directory at each iteration. Defaults to FALSE. |
nhats |
(default = FALSE). If you have calculated bootstrap NHATS yourself because there is no simple ddf object, a matrix of these may be passed to the function. The number of columns should be >= B. The rows must be equal to those in |
seed |
Set the seed for the bootstrap sampling process. |
nCores |
Set the number of computer cores for the bootstrap process to use (default = 1). The more cores, the faster the process, but be wary of overusing the cores on your computer. If |
In case of distance sampling data, the following steps are performed for each iteration:
the original data is bootstrapped
a detection function is fitted to the bootstrapped data
a count model is fitted to the bootstrapped data
coefficients are resampled from a multivariate normal distribution defined by MLE and COV from the count model
predictions are made to the prediction data using the resampled coefficients
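The resampling in the first step can be sketched roughly as below. This is a minimal base R illustration of resampling whole transects within each stratum, not the package's internal code: the toy data frame, the helper `resample_transects` and the new column names ending in `2` are all hypothetical.

```r
# Hypothetical toy data: 2 surveys (strata), 3 transects each, 2 segments per transect.
set.seed(12345)
toy <- data.frame(
  survey.id   = rep(c("A", "B"), each = 6),
  transect.id = rep(1:6, each = 2),
  segment.id  = 1:12,
  NHAT        = rpois(12, lambda = 3)
)

# Hypothetical helper: resample whole transects with replacement,
# separately within each stratum, and relabel the drawn copies.
resample_transects <- function(dat, resample = "transect.id",
                               rename = "segment.id", stratum = NULL) {
  strata <- if (is.null(stratum)) rep(1, nrow(dat)) else dat[[stratum]]
  pieces <- list()
  for (s in unique(strata)) {
    sub <- dat[strata == s, , drop = FALSE]
    ids <- unique(sub[[resample]])
    drawn <- ids[sample.int(length(ids), length(ids), replace = TRUE)]
    for (j in seq_along(drawn)) {
      block <- sub[sub[[resample]] == drawn[j], , drop = FALSE]
      # New IDs keep repeated draws of the same transect distinct.
      block[[paste0(resample, "2")]] <- paste(s, j, sep = "_")
      for (col in rename) {
        block[[paste0(col, "2")]] <- paste(block[[col]], s, j, sep = "_")
      }
      pieces[[length(pieces) + 1]] <- block
    }
  }
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

boot.data <- resample_transects(toy, stratum = "survey.id")
head(boot.data)
```

Because each stratum is resampled separately, every survey contributes the same number of transects to the bootstrap data as it did to the original data.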
In case of count data, the following steps are performed for each iteration:
coefficients are resampled from a multivariate normal distribution defined by MLE and COV from the best fitting count model
predictions are made to the prediction data using the resampled coefficients
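The coefficient resampling and prediction steps can be illustrated with a small sketch. It assumes an ordinary Poisson GLM on simulated data as a stand-in for the fitted count model, and draws from the multivariate normal with a Cholesky factor so that only base R is needed; the variable names are illustrative and the package's own implementation may differ.

```r
set.seed(12345)
# Hypothetical stand-in for the fitted count model: a small Poisson GLM.
fit.data <- data.frame(depth = runif(200, 5, 40))
fit.data$count <- rpois(200, lambda = exp(1 - 0.03 * fit.data$depth))
fit <- glm(count ~ depth, family = poisson, data = fit.data)

beta.hat <- coef(fit)  # maximum likelihood estimates (MLE)
V <- vcov(fit)         # estimated covariance matrix (COV)

# Draw B coefficient vectors from N(beta.hat, V).
B <- 500
p <- length(beta.hat)
L <- t(chol(V))                      # lower-triangular factor, L %*% t(L) equals V
z <- matrix(rnorm(p * B), nrow = p)  # p x B standard normal draws
beta.boot <- beta.hat + L %*% z      # p x B; beta.hat is recycled down each column

# Predict on a hypothetical prediction grid, on the scale of the response.
grid <- data.frame(depth = seq(5, 40, length.out = 10))
X <- model.matrix(~ depth, data = grid)  # 10 x p design matrix
boot.preds <- exp(X %*% beta.boot)       # 10 x B matrix of predictions
dim(boot.preds)
```

Each column of `boot.preds` corresponds to one set of resampled coefficients, which mirrors the rows-by-iterations layout described for the returned matrix.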
The function returns a matrix of bootstrap predictions. The number of rows is equal to the number of rows in predict.data. The number of columns is equal to B. The matrix may be very large and so is saved directly to the working directory as a workspace file: '"name"predictionboot.RData'. The object inside is called bootPreds.
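Once loaded, the prediction matrix can be summarised row by row. The sketch below uses a simulated matrix as a hypothetical stand-in for the saved object and computes 95% percentile intervals for each prediction cell; it is one common way to summarise bootstrap output and is not necessarily what the package does.

```r
set.seed(12345)
# Hypothetical stand-in for the saved matrix: 5 prediction cells x 200 iterations.
bootPreds <- matrix(rgamma(5 * 200, shape = 20, rate = 4), nrow = 5)

# One row per prediction cell: lower and upper 95% percentile limits.
cis <- t(apply(bootPreds, 1, quantile, probs = c(0.025, 0.975)))
colnames(cis) <- c("lower", "upper")
cis
```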
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# offshore redistribution data
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
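library(MRSea)    # do.bootstrap.cress, create.NHAT, runSALSA2D, ...
library(mrds)     # ddf
library(geepack)  # geeglm
data(dis.data.re)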
data(predict.data.re)
data(knotgrid.off)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# distance sampling
dis.data.re$survey.id<-paste(dis.data.re$season,dis.data.re$impact,sep="")
result<-ddf(dsmodel=~mcds(key="hn", formula=~1), data=dis.data.re, method="ds",
meta.data=list(width=250))
dis.data.re<-create.NHAT(dis.data.re,result)
count.data<-create.count.data(dis.data.re)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# spatial modelling
splineParams<-makesplineParams(data=count.data, varlist=c('depth'))
#set some input info for SALSA
count.data$response<- count.data$NHAT
# make distance matrices for datatoknots and knottoknots
distMats<-makeDists(cbind(count.data$x.pos, count.data$y.pos), na.omit(knotgrid.off))
# choose sequence of radii
r_seq<-getRadiiChoices(8,distMats$dataDist)
# set initial model without the spatial term
initialModel<- glm(response ~ as.factor(season) + as.factor(impact) + offset(log(area)),
family='quasipoisson', data=count.data)
# make parameter set for running salsa2d
salsa2dlist<-list(fitnessMeasure = 'QICb', knotgrid = knotgrid.off,
knotdim=c(26,14), startKnots=4, minKnots=4,
maxKnots=20, r_seq=r_seq, gap=4000, interactionTerm="as.factor(impact)")
salsa2dOutput_k6<-runSALSA2D(initialModel, salsa2dlist, d2k=distMats$dataDist,
k2k=distMats$knotDist, splineParams=splineParams)
splineParams<-salsa2dOutput_k6$splineParams
# specify parameters for local radial function:
radiusIndices <- splineParams[[1]]$radiusIndices
dists <- splineParams[[1]]$dist
radii <- splineParams[[1]]$radii
aR <- splineParams[[1]]$invInd[splineParams[[1]]$knotPos]
count.data$blockid<-paste(count.data$transect.id, count.data$season, count.data$impact, sep='')
# Re-fit the chosen model as a GEE (based on SALSA knot placement) to obtain GEE p-values
geeModel<- geeglm(formula(salsa2dOutput_k6$bestModel), data=count.data, family=poisson, id=blockid)
dists<-makeDists(cbind(predict.data.re$x.pos, predict.data.re$y.pos), na.omit(knotgrid.off),
knotmat=FALSE)$dataDist
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# bootstrapping
do.bootstrap.cress(dis.data.re, predict.data.re, ddf.obj=result, geeModel, splineParams,
g2k=dists, resample='transect.id', rename='segment.id', stratum='survey.id',
B=4, name="cress", save.data=FALSE, nhats=FALSE, nCores=1)
load("cresspredictionboot.RData") # loading the bootstrap predictions into the workspace
# look at the first 6 lines of the bootstrap predictions (on the scale of the response)
head(bootPreds)
## Not run:
# In parallel (Note: Windows machines only)
require(parallel)
do.bootstrap.cress(dis.data.re, predict.data.re, ddf.obj=result, geeModel, splineParams,
g2k=dists, resample='transect.id', rename='segment.id', stratum='survey.id',
B=4, name="cress", save.data=FALSE, nhats=FALSE, nCores=4)
load("cresspredictionboot.RData") # loading the bootstrap predictions into the workspace
# look at the first 6 lines of the bootstrap predictions (on the scale of the response)
head(bootPreds)
## End(Not run)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# nearshore redistribution data
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
## Not run:
do.bootstrap.cress(ns.data.re, ns.predict.data.re, ddf.obj=NULL, geeModel, splineParams,
g2k=dists, resample='transect.id', rename='segment.id', stratum=NULL,
B=2, name="cress", save.data=FALSE, nhats=FALSE)
load("cresspredictionboot.RData") # loading the predictions into the workspace
# look at the first 6 lines of the bootstrap predictions (on the scale of the response)
head(bootPreds)
## End(Not run)