crm: Capture-recapture model fitting function

Description

Fits user-specified models to some types of capture-recapture data wholly in R and not with MARK. A single function that processes the data, creates the design data, builds the crm model, and runs it.

Usage

crm(
  data,
  ddl = NULL,
  begin.time = 1,
  model = "CJS",
  title = "",
  model.parameters = list(),
  design.parameters = list(),
  initial = NULL,
  groups = NULL,
  time.intervals = NULL,
  debug = FALSE,
  method = NULL,
  hessian = FALSE,
  accumulate = TRUE,
  chunk_size = 1e+07,
  control = list(),
  refit = 1,
  itnmax = 5000,
  scale = NULL,
  run = TRUE,
  burnin = 100,
  iter = 1000,
  use.admb = FALSE,
  use.tmb = FALSE,
  crossed = NULL,
  reml = FALSE,
  compile = FALSE,
  extra.args = NULL,
  strata.labels = NULL,
  clean = NULL,
  save.matrices = FALSE,
  simplify = FALSE,
  getreals = FALSE,
  real.ids = NULL,
  check = FALSE,
  prior = FALSE,
  prior.list = NULL,
  useHess = FALSE,
  optimize = TRUE,
  ...
)

Arguments

data

Either the raw data which is a dataframe with at least one column named ch (a character field containing the capture history) or a processed dataframe

ddl

Design data list which contains a list element for each parameter type; if NULL it is created

begin.time

Time of first capture(release) occasion

model

Type of capture-recapture model (e.g., "cjs", "js")

title

Optional title; not used at present

model.parameters

List of model parameter specifications

design.parameters

Specification of any grouping variables for design data for each parameter

initial

Optional vector of initial values for beta parameters; if named from previous analysis only relevant values are used

groups

Vector of names of factor variables for creating groups

time.intervals

Intervals of time between the capture occasions

debug

if TRUE, shows optimization output

method

optimization method

hessian

if TRUE, computes v-c matrix using hessian

accumulate

if TRUE, like capture-histories are accumulated to reduce computation

chunk_size

specifies amount of memory to use in accumulating capture histories and design matrices; amount used is 8*chunk_size/1e6 MB (default 80MB)

control

control string for optimization functions

refit

non-zero entry to refit

itnmax

maximum number of iterations for optimization

scale

vector of scale values for parameters

run

if TRUE, it runs model; otherwise if FALSE can be used to test model build components

burnin

number of iterations for mcmc burnin; specified default not realistic for actual use

iter

number of iterations after burnin for mcmc (not realistic default)

use.admb

if TRUE uses ADMB for cjs, mscjs or mvms models

use.tmb

if TRUE runs TMB for cjs, mscjs or mvms models

crossed

if TRUE it uses cjs.tpl or cjs_reml.tpl if reml=FALSE or TRUE respectively; if FALSE, then it uses cjsre which can use Gauss-Hermite integration

reml

if TRUE uses restricted maximum likelihood

compile

if TRUE forces re-compilation of tpl file

extra.args

optional character string that is passed to admb if use.admb==TRUE

strata.labels

labels for strata used in capture history; they are converted to numeric in the order listed. Only needed to specify unobserved strata. For any unobserved strata p=0.

clean

if TRUE, deletes the tpl and executable files for admb if use.admb=TRUE

save.matrices

for HMM models this option controls whether the gamma,dmat and delta matrices are saved in the model object

simplify

if TRUE, design matrix is simplified to unique values including fixed values

getreals

if TRUE, compute real values and std errors for TMB models; may want to set as FALSE until model selection is complete

real.ids

vector of id values for which real parameters should be output with std error information for TMB models; if NULL all ids used

check

if TRUE values of gamma, dmat and delta are checked to make sure the values are valid with initial parameter values.

prior

if TRUE will expect vectors of prior values in list prior.list; currently only implemented for cjsre_tmb

prior.list

which contains list of prior parameters that will be model dependent

useHess

if TRUE, the TMB hessian function is used for optimization; using hessian is typically slower with many parameters but can result in a better solution

optimize

if TRUE, optimizes to get parameter estimates; set to FALSE to extract estimates of ADREPORTed values only

...

optional arguments passed to js or cjs and optimx
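
The memory rule given above for chunk_size can be checked with a little arithmetic. The helper name chunk_mb is hypothetical and is not part of marked; it only restates the formula 8*chunk_size/1e6.

```r
# Hypothetical helper (not part of marked): memory in MB implied by a
# chunk_size value, using the rule stated for the chunk_size argument.
chunk_mb <- function(chunk_size) 8 * chunk_size / 1e6

chunk_mb(1e7)   # default chunk_size: 80 MB
chunk_mb(5e6)   # half the default: 40 MB
```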

Details

This function is operationally similar to the function mark in RMark in that it is a shell that calls several other functions to perform the following steps: 1) process.data to set up the data and parameters and package them into a list (processed data), 2) make.design.data to create the design data for each parameter in the specified model, 3) create.dm to create the design matrices for each parameter based on the formula provided for each parameter, 4) a call to the specific function for model fitting (now either cjs_admb or js). As with mark, the calling arguments for crm are a compilation of the calling arguments for each of the functions it calls (with some arguments renamed to avoid conflicts). If the data have already been processed, crm expects to find a value for ddl. Likewise, if the data have not been processed, then ddl should be NULL. This dual calling structure allows either a single-call approach for each model or, alternatively, for the data to be processed and the design data (ddl) to be created once so that a whole series of models can be analyzed without repeating those steps.
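
The two calling styles described above can be sketched as follows. This is a minimal illustration that assumes the marked package and its dipper data are installed; the object names (mod.single, mod.time, mod.dot) are made up for the example.

```r
# Sketch only: runs when the marked package is available.
if (requireNamespace("marked", quietly = TRUE)) {
  library(marked)
  data(dipper)

  # Style 1: a single call; crm processes the data and creates ddl itself
  mod.single <- crm(dipper, model = "cjs",
                    model.parameters = list(Phi = list(formula = ~time)))

  # Style 2: process the data and create ddl once, then fit several models
  dipper.proc <- process.data(dipper, model = "cjs", begin.time = 1)
  dipper.ddl  <- make.design.data(dipper.proc)
  mod.time <- crm(dipper.proc, dipper.ddl,
                  model.parameters = list(Phi = list(formula = ~time)))
  mod.dot  <- crm(dipper.proc, dipper.ddl,
                  model.parameters = list(Phi = list(formula = ~1)))
}
```

The second style avoids repeating the data processing and design-data steps for every model in a series.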

There are some optional arguments that can be used to set initial values and control other aspects of the optimization. The optimization is done with the R package/function optimx, and the arguments method and hessian are described in the help for that function. In addition, any arguments not matching those for cjs_admb (the ...) are passed to optimx, allowing any of the other parameters to be set. If you set debug=TRUE, then at each function evaluation (cjs.lnl), the current values of the parameters and the -2*log-likelihood value are output.

In the current implementation, a logit link is used to constrain the parameters in the unit interval (0,1) except for probability of entry which uses an mlogit and N which uses a log link. For the probitCJS model, a probit link is used for the parameters. These could be generalized to use other link functions. Following the notation of MARK, the parameters in the link space are referred to as beta and those in the actual parameter space of Phi and p as reals.
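
As a rough illustration of the relationship between beta (link scale) and real values, the inverse links can be written in base R. The function names are hypothetical, and the mlogit form shown is the standard multinomial logit with one implicit reference category, which is an assumption about how the package parameterizes probability of entry.

```r
# Inverse links: map beta values (link scale) to real parameter values.
inv_logit  <- function(beta) plogis(beta)   # Phi and p, in (0,1)
inv_probit <- function(beta) pnorm(beta)    # probitCJS parameters, in (0,1)
inv_log    <- function(beta) exp(beta)      # N, positive

# Standard multinomial logit (assumed form): the returned probabilities plus
# the implicit reference category sum to 1.
inv_mlogit <- function(beta) exp(beta) / (1 + sum(exp(beta)))

inv_logit(0)                      # 0.5
inv_probit(0)                     # 0.5
inv_log(log(25))                  # approximately 25
pent <- inv_mlogit(c(-1, 0, 1))
sum(pent) < 1                     # TRUE; the remainder is the reference category
```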

Initial values can be set in 2 ways. To set a baseline initial value for the intercept of Phi or p, set those arguments to some real value in the open interval (0,1). All non-intercept beta parameters are set to zero. Alternatively, you can specify in initial a vector of initial values for the beta parameters (on the logit scale). This is most easily done by passing the results from a previous model run using the result list element beta as described below. The code will match the names of the current design matrix to the names in beta and use the appropriate initial values. Any non-specified values are set to 0. If there are no names associated with the initial vector, then they are simply used in the specified order. If you do not specify initial values, it is equivalent to setting Phi and p to 0.5.
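
The name matching described above can be sketched in base R. The helper match_initial and the beta values are hypothetical; this is not the package's own code, only an illustration of the stated behavior (matched names are used, everything else defaults to 0, and unnamed vectors are taken in order).

```r
# Hypothetical helper (not the package's code): align initial values with
# the column names of the current design matrix.
match_initial <- function(initial, dm_names) {
  if (is.null(names(initial))) {
    # Unnamed vector: values are used in the order given
    stopifnot(length(initial) == length(dm_names))
    return(setNames(initial, dm_names))
  }
  out <- setNames(numeric(length(dm_names)), dm_names)  # default is 0
  common <- intersect(names(initial), dm_names)
  out[common] <- initial[common]
  out
}

previous <- c("(Intercept)" = 0.4, "sexMale" = -0.1)   # made-up betas
start <- match_initial(previous, c("(Intercept)", "time2", "sexMale"))
start           # time2 has no match, so it starts at 0

qlogis(0.5)     # 0: a beta of 0 on the logit scale is a real value of 0.5
```

The last line shows why leaving initial unspecified is equivalent to starting Phi and p at 0.5.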

If you have a study with unequal time intervals between capture occasions, then these can be specified with the argument time.intervals.
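
A minimal sketch of passing time.intervals, assuming the marked package and its dipper data are installed. The dipper capture histories have 7 occasions, so 6 intervals are needed; the interval lengths below are made up for illustration and do not describe the real dipper study.

```r
# Sketch only: runs when the marked package is available.
if (requireNamespace("marked", quietly = TRUE)) {
  library(marked)
  data(dipper)

  # Hypothetical interval lengths: pretend the third interval was twice as long
  intervals <- c(1, 1, 2, 1, 1, 1)

  mod.unequal <- crm(dipper, model = "cjs", time.intervals = intervals,
                     model.parameters = list(Phi = list(formula = ~1),
                                             p   = list(formula = ~1)))
}
```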

The argument accumulate defaults to TRUE. When it is TRUE, it will accumulate common capture histories that also have common design and common fixed values (see below) for the parameters. This will speed up the analysis because the calculation of the likelihood (cjs.lnl) loops over the unique values. In general the default will be the best unless you have many capture histories and are using many individual covariates in the formula that would make each entry unique. In that case there will be no effect of accumulation, but the code will still try to accumulate. In that particular case, by setting accumulate=FALSE you can skip the code run for accumulation.
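
The idea behind accumulation can be illustrated in base R with a toy data set. The data and the freq column name are made up, and this is not the package's implementation; it only shows identical records being collapsed into unique records with a count.

```r
# Toy data (made up): capture histories with one grouping covariate
toy <- data.frame(ch  = c("101", "101", "110", "101", "110"),
                  sex = c("F",   "F",   "M",   "M",   "M"))

# Collapse identical rows and record how many animals share each one
toy$freq <- 1
collapsed <- aggregate(freq ~ ch + sex, data = toy, FUN = sum)
collapsed

nrow(toy)         # 5 capture histories
nrow(collapsed)   # collapsed into 3 unique records
```

If every record were unique, as happens with many individual covariates, collapsed would have as many rows as toy and the extra step would gain nothing.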

Most of the arguments controlling the fitted model are contained in lists in the arguments model.parameters and design.parameters, which are similar to their counterparts in mark in RMark. Each is a named list with the names being the parameters in the model (e.g., Phi and p in "cjs" and "Phi","p","pent","N" in "js"). Each named element is also a list containing various values defining the design data and model for the parameter. The elements of model.parameters can include formula, which is an R formula to create the design matrix for the parameter, and fixed, which is a matrix of fixed values as described below. The elements of design.parameters can include time.varying, fields, time.bins, age.bins, and cohort.bins. See create.dmdf for a description of the first 2 and create.dm for a description of the last 3.
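
The nested-list structure described above looks like this in practice. The formulas and the time.bins cut points are illustrative assumptions, not recommended values.

```r
# One named element per model parameter; each element is itself a list.
model.parameters <- list(
  Phi = list(formula = ~time),   # design matrix formula for Phi
  p   = list(formula = ~1)       # constant p
)

# Design-data options per parameter (cut points are made up for illustration)
design.parameters <- list(
  Phi = list(time.bins = c(1, 3, 5, 7))
)

names(model.parameters)          # "Phi" "p"
class(model.parameters$Phi$formula)
```

Both lists would then be supplied to crm through the arguments of the same names.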

Real parameters can be set to fixed values using fixed=x where x is a matrix with 3 columns and any number of rows. The first column specifies the particular animal (capture history) as the row number in the data frame. The second specifies the capture occasion number for the real parameter to be fixed. For Phi and pent these are 1 to nocc-1 and for p they are 2 to nocc for "cjs" and 1 to nocc for "js". This difference is due to the parameter labeling by the beginning of the interval for Phi (e.g., survival from occasion 1 to 2) and by the occasion for p. For "cjs" p is not estimated for occasion 1. The third element in the row is the real value in the closed unit interval [0,1] for the fixed parameter. This approach is completely general, allowing you to fix a particular real parameter for a specific animal and occasion, but it is a bit kludgy. Alternatively, you can set fixed values by specifying values for a field called fix in the design data for a parameter. If the value of fix is NA the parameter is estimated, and if it is not NA then the real parameter is fixed at that value. If you also specify fixed as described above, it will override any values you have also set with fix in the design data. To set all of the real values for a particular occasion you can use the following example with the dipper data as a template:

model.parameters=list(Phi=list(formula=~1, fixed=cbind(1:nrow(dipper),rep(2,nrow(dipper)),rep(1,nrow(dipper)))))

The above sets Phi to 1 for the interval between occasions 2 and 3 for all animals.
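
To make the three-column layout concrete, here is the same construction on a small stand-in. The count n_animals and the column names are assumptions added for readability; the template itself uses an unnamed matrix built from the dipper data.

```r
# Stand-in for nrow(dipper): only the number of animals matters here
n_animals <- 4

fixed <- cbind(animal   = 1:n_animals,         # row number of each animal
               occasion = rep(2, n_animals),   # occasion index of the parameter
               value    = rep(1, n_animals))   # fixed real value in [0,1]
fixed
dim(fixed)   # 4 rows, 3 columns
```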

Alternatively, you could do as follows:

data(dipper)
dp=process.data(dipper)
ddl=make.design.data(dp)
ddl$Phi$fix=ifelse(ddl$Phi$time==2,1,NA)

At present there is no modification of the parameter count to address fixing of real parameters except that if, by fixing reals, a beta is not needed in the design it will be dropped. For example, if you were to use ~time for Phi with survival fixed to 1 for time 2, then the beta for that time would not be included.
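
Why the beta for time 2 becomes unnecessary can be seen with a toy design matrix in base R. The data frame ddl_phi is made up, and the column-dropping rule shown is a sketch of the idea rather than the package's own code.

```r
# Toy design data (made up): one row per interval, with time as a factor
ddl_phi <- data.frame(time = factor(1:6))
ddl_phi$fix <- ifelse(ddl_phi$time == 2, 1, NA)

# Design matrix for ~time, keeping only rows that are still estimated
dm <- model.matrix(~time, data = ddl_phi)[is.na(ddl_phi$fix), , drop = FALSE]

# A column that is zero in every estimated row contributes nothing
unused <- colnames(dm)[colSums(abs(dm)) == 0]
unused                        # "time2"

dm <- dm[, colSums(abs(dm)) > 0, drop = FALSE]
ncol(dm)                      # 5 betas remain instead of 6
```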

To use ADMB (use.admb=TRUE), you need to install: 1) the R package R2admb, 2) ADMB, and 3) a C++ compiler (I recommend gcc compiler). The following are instructions for installation with Windows. For other operating systems see (http://www.admb-project.org/downloads) and (http://www.admb-project.org/tools/gcc/).

Windows Instructions:

1) In R use install.packages function or choose Packages/Install Packages from menu and select R2admb.

2) Install ADMB 11: http://www.admb-project.org/downloads. Put the software in C:/admb to avoid problems with spaces in directory name and for the function below to work.

3) Install gcc compiler from: http://www.admb-project.org/tools/gcc/. Put in c:/MinGW

I use the following function in R to setup R2admb to access ADMB rather than adding to my path so gcc versions with Rtools don't conflict.

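prepare_admb=function()
{
  # Prepend the ADMB and MinGW directories to PATH for this R session
  Sys.setenv(PATH = paste("c:/admb/bin;c:/admb/utilities;c:/MinGW/bin",
        Sys.getenv("PATH"), sep = ";"))
  # Tell R2admb where ADMB is installed
  Sys.setenv(ADMB_HOME = "c:/admb")
  invisible()
}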

To use different locations you'll need to change the values used above.

Before running crm with use.admb=T, execute the function prepare_admb(). You could put this function or the code it contains in your .First or .Rprofile so it runs each time you start R.
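
If you change the install locations, a quick check that the directories exist can save a confusing compile failure later. The helper check_admb_dirs is hypothetical (not part of marked or R2admb), and its default paths are simply the ones suggested in the Windows instructions.

```r
# Hypothetical helper: report which of the expected directories exist.
check_admb_dirs <- function(dirs = c("c:/admb/bin",
                                     "c:/admb/utilities",
                                     "c:/MinGW/bin")) {
  setNames(dir.exists(dirs), dirs)
}

found <- check_admb_dirs()
found                      # TRUE for each directory that was found
Sys.getenv("ADMB_HOME")    # empty string until prepare_admb() has been run
```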

Value

crm model object with class=("crm",submodel) where submodel is either "CJS" or "JS" at present.

Author(s)

Jeff Laake

See Also

cjs_admb, js, make.design.data, process.data

Examples

{
# cormack-jolly-seber model
# fit 2 cjs models with crm
data(dipper)
dipper.proc=process.data(dipper,model="cjs",begin.time=1)
dipper.ddl=make.design.data(dipper.proc)
mod.Phit.pt=crm(dipper.proc,dipper.ddl,
   model.parameters=list(Phi=list(formula=~time),p=list(formula=~time)))
mod.Phit.pt
mod.Phisex.pdot=crm(dipper.proc,dipper.ddl,groups="sex",
   model.parameters=list(Phi=list(formula=~sex),p=list(formula=~1)))
mod.Phisex.pdot
## if you have RMark installed you can use this code to run the same models 
## by removing the comment symbol
#library(RMark)
#data(dipper)
#mod0=mark(dipper,
#model.parameters=list(Phi=list(formula=~time),p=list(formula=~time)),output=FALSE)
#summary(mod0,brief=TRUE)
#mod1=mark(dipper,
#model.parameters=list(Phi=list(formula=~1),p=list(formula=~1)),output=FALSE)
#summary(mod1,brief=TRUE)
#mod2<-mark(dipper,groups="sex",
#model.parameters=list(Phi=list(formula=~sex),p=list(formula=~1)),output=FALSE)
#summary(mod2,brief=TRUE)
# jolly seber model
crm(dipper,model="js",groups="sex",
   model.parameters=list(pent=list(formula=~sex),N=list(formula=~sex)),accumulate=FALSE)

# This example is excluded from testing to reduce package check time
# if you have RMark installed you can use this code to run the same models 
# by removing the comment 
#data(dipper)
#data(mstrata)
#mark(dipper,model.parameters=list(p=list(formula=~time)),output=FALSE)$results$beta
#mark(mstrata,model="Multistrata",model.parameters=list(p=list(formula=~1),
# S=list(formula=~1),Psi=list(formula=~-1+stratum:tostratum)),
# output=FALSE)$results$beta
#mod=mark(dipper,model="POPAN",groups="sex",
#   model.parameters=list(pent=list(formula=~sex),N=list(formula=~sex)))
#summary(mod)
#CJS example with hmm
crm(dipper,model="hmmCJS",model.parameters = list(p = list(formula = ~time)))
##MSCJS example with hmm
data(mstrata)
ms=process.data(mstrata,model="hmmMSCJS",strata.labels=c("A","B","C"))
ms.ddl=make.design.data(ms)
ms.ddl$Psi$fix=NA
ms.ddl$Psi$fix[ms.ddl$Psi$stratum==ms.ddl$Psi$tostratum]=1
crm(ms,ms.ddl,model.parameters=list(Psi=list(formula=~-1+stratum:tostratum)))

}

Example output

Loading required package: lme4
Loading required package: Matrix
Loading required package: parallel
This is marked 1.1.13

Warning message:
In checkMatrixPackageVersion() : Package version inconsistency detected.
TMB was built with Matrix version 1.2.10
Current Matrix version is 1.2.11
Please re-install 'TMB' from source or restore original 'Matrix' package
255 capture histories collapsed into 53
Fitting model

Computing initial parameter estimates

Starting optimization for 12 parameters...

 Number of evaluations:  100  -2lnl: 657.9114231
 Number of evaluations:  200  -2lnl: 657.0242112
 Number of evaluations:  300  -2lnl: 656.9585639
 Number of evaluations:  400  -2lnl:  656.950212
 Number of evaluations:  500  -2lnl: 657.2868122
 Number of evaluations:  600  -2lnl: 656.9532409
 Number of evaluations:  700  -2lnl: 656.9719899
 Number of evaluations:  800  -2lnl: 656.9543594
 Number of evaluations:  900  -2lnl: 656.9670165
 Number of evaluations:  1000  -2lnl: 656.9507391
 Number of evaluations:  1100  -2lnl:  656.950212
Elapsed time in minutes:  0.0143 
Fitting model

Computing initial parameter estimates

Starting optimization for 3 parameters...

 Number of evaluations:  100  -2lnl: 667.1588615
Elapsed time in minutes:  0.0228 
Model: JS

Processing data...

Creating design data...

Fitting model

Computing initial parameter estimates

Starting optimization 6  parameters

 Number of evaluations:  100  -2lnl: -552.4664113
 Number of evaluations:  200  -2lnl: -555.0266476
 Number of evaluations:  300  -2lnl: -555.0192503
 Number of evaluations:  400  -2lnl:  -554.99496
Elapsed time in minutes:  0.0271 
Model: HMMCJS

Processing data...

255 capture histories collapsed into 53
Creating design data...

Fitting model


Elapsed time in minutes:  0.0139 
252 capture histories collapsed into 252
Fitting model


Elapsed time in minutes:  0.0346 

crm Model Summary

Npar :  8
-2lnL:  30443.75
AIC  :  30459.75

Beta
                           Estimate
S.(Intercept)            0.79096825
p.(Intercept)            0.02070916
Psi.stratumB:tostratumA -1.10486961
Psi.stratumC:tostratumA -1.09728716
Psi.stratumA:tostratumB -1.10590085
Psi.stratumC:tostratumB -1.09706174
Psi.stratumA:tostratumC -1.10771956
Psi.stratumB:tostratumC -1.10665151
Warning messages:
1: In max(logpar) : no non-missing arguments to max; returning -Inf
2: In min(logpar) : no non-missing arguments to min; returning Inf
3: In max(logpar) : no non-missing arguments to max; returning -Inf
4: In min(logpar) : no non-missing arguments to min; returning Inf

marked documentation built on Dec. 9, 2019, 9:06 a.m.