mark.wrapper.parallel: Constructs and runs in parallel a set of MARK models from a...

View source: R/mark.wrapper.parallel.r

mark.wrapper.parallel    R Documentation

Constructs and runs in parallel a set of MARK models from a dataframe of parameter specifications

Description

This convenience function takes a dataframe of parameter specifications created by create.model.list, constructs and runs each model, and names each model by concatenating its parameter specification names separated by periods. The results are returned as a marklist with a model.table constructed by collect.models.
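As a rough illustration of the naming rule, the base R sketch below builds a small data frame shaped like the output of create.model.list and joins each row with periods. The data frame contents and the helper name make_model_names are hypothetical; this is an assumption about the behaviour described above, not the RMark source.

```r
# Hypothetical stand-in for the data frame returned by create.model.list():
# one column per parameter, one row per model, cells hold specification names.
model.list <- expand.grid(Phi = c("Phi.1", "Phi.2"),
                          p = c("p.1", "p.2", "p.3"),
                          stringsAsFactors = FALSE)

# Assumed naming rule: join the specification names in each row with a period.
make_model_names <- function(model.list) {
  unname(apply(model.list, 1, paste, collapse = "."))
}

model.names <- make_model_names(model.list)
print(model.names)
```

With 2 specifications for Phi and 3 for p, the sketch yields 6 names, one per row of the data frame.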

Usage

mark.wrapper.parallel(
  model.list,
  silent = FALSE,
  use.initial = FALSE,
  initial = NULL,
  parallel = TRUE,
  cpus = 2,
  threads = 1,
  ...
)

Arguments

model.list

a dataframe of parameter specification names in the calling frame

silent

if TRUE, errors that are encountered are suppressed

use.initial

if TRUE, initial values are constructed for new models using completed models that have already been run in the set

initial

vector, mark model or marklist for defining initial values

parallel

if TRUE, runs models in parallel on multiple cpus

cpus

number of cpus to use in parallel

threads

number of cpus to use with mark.exe if positive or number of cpus to remain idle if negative

...

arguments to be passed to mark. These must be specified as argument=value pairs.
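
The sign convention for threads can be sketched in base R as follows. The helper resolve_threads is hypothetical and only interprets the argument description above; how mark.exe actually handles the value is not shown here, and the treatment of 0 and of very negative values is an assumption of this sketch.

```r
# Hypothetical helper that interprets `threads` as described above:
# positive -> use that many cpus; negative -> leave that many cpus idle.
resolve_threads <- function(threads, available = parallel::detectCores()) {
  if (is.na(available)) stop("number of cpus could not be detected")
  if (threads > 0) return(as.integer(threads))
  # Assumption: never drop below one cpu when many are asked to stay idle.
  if (threads < 0) return(as.integer(max(1, available + threads)))
  stop("threads = 0 is not covered by the description above")
}

resolve_threads(2, available = 8)   # use 2 cpus
resolve_threads(-1, available = 8)  # leave 1 idle, use the other 7
```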

Details

The model names in model.list must be in the frame of the function that calls run.models. If model.list=NULL, the MARK models are collected from the frame of the calling function (the parent). If type is specified, only the models of that type (e.g., "CJS") are run. In each case the models are run and saved in the parent frame. To fully understand how this function works and its limitations, see create.model.list.

If use.initial=TRUE, then before running a model the function looks for the first model that has already been run (if any) for each parameter formula and constructs an initial vector from that previous run. For example, if you provided 5 models for p and 3 for Phi in a CJS model, then as soon as the first model for p has been run, the initial values for p in the subsequent 2 models with different Phi models are assigned based on the run with the first Phi model. At the outset this seemed like a good way to speed up execution, but in the one set of examples I ran, where several parameters were at boundaries, the results were discouraging: the models converged to a poorer likelihood value than the runs using the default initial values. I've left this option in but set its default value to FALSE.
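
The lookup described for use.initial can be sketched in base R as below. The model set, the helper find_donors, and the idea of tracking completed models by row number are all hypothetical; the sketch only shows the "first completed model with the same specification" rule, not how RMark builds the initial vector.

```r
# Hypothetical model set: 3 Phi specifications crossed with 2 p specifications.
model.list <- expand.grid(Phi = c("Phi.1", "Phi.2", "Phi.3"),
                          p = c("p.1", "p.2"),
                          stringsAsFactors = FALSE)

# For model `i`, find, for each parameter, the first already-run model
# (rows listed in `completed`) that used the same specification.
find_donors <- function(model.list, i, completed) {
  sapply(names(model.list), function(par) {
    same <- completed[model.list[completed, par] == model.list[i, par]]
    if (length(same) > 0) same[1] else NA_integer_
  })
}

# Suppose models 1 to 3 have been run and model 4 (Phi.1 with p.2) is next.
donors <- find_donors(model.list, i = 4, completed = 1:3)
print(donors)
```

In this sketch model 4 can borrow Phi values from model 1, which used the same Phi specification, while p has no completed donor yet and would fall back to the default initial values.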

A possibly more useful argument is initial. Previously, you could use initial=model as part of the ... arguments and it would use the estimates from that model to assign initial values for any model in the set. Now initial is a specific argument; it can be used as above, or you can use it to specify a marklist of previously run models. When you do that, the code looks up each new model to be run in the set of models specified by initial, and if it finds one with a matching name it uses the estimates for any matching parameters as initial values, in the same way as initial=model does. The model name is based on concatenating the names of each of the parameter specification objects.

To make this useful, you'll want to adopt an approach that I've started to use: naming the objects something like p.1, p.2, etc. rather than something like p.dot, p.time as done in many of the examples. I've found that the numeric approach requires much less typing and is less cumbersome than trying to reflect the formula in the name. By default, the formula is shown in the model selection results table, so reflecting it in the name was a bit redundant.

This is where I see the most benefit. Individual covariate models tend to run rather slowly. So one approach is to first run the sequence of models (e.g., with results stored in initial_marklist) using the set of formulas with all of the variables other than individual covariates. Then run another set with the same numbering scheme, but add the individual covariates to the formulas and use initial=initial_marklist. That will work if each parameter specification has the same name (e.g., p.1=list(formula=~time) and then p.1=list(formula=~time+an_indiv_covariate)). All of the initial values will be assigned from the previous run except for any added parameters (e.g., an_indiv_covariate), which will start with an initial value of 0.
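
The matching rule for initial=marklist can be sketched in base R as below. The helper build_initial, the list previous, and the beta labels are hypothetical stand-ins chosen for illustration; the sketch assumes that estimates are matched by model name and then by parameter label, as the text above describes, and is not the RMark implementation.

```r
# Hypothetical beta estimates from a previously run model named "Phi.1.p.1".
previous <- list(
  "Phi.1.p.1" = c("Phi:(Intercept)" = 1.2, "p:(Intercept)" = -0.4, "p:time2" = 0.3)
)

# Build initial values for a new model: reuse estimates whose labels match,
# and start any added parameter at 0.
build_initial <- function(model.name, beta.names, previous) {
  init <- setNames(numeric(length(beta.names)), beta.names)
  old <- previous[[model.name]]   # NULL when no model with that name exists
  if (!is.null(old)) {
    shared <- intersect(beta.names, names(old))
    init[shared] <- old[shared]
  }
  init
}

# New model with the same name, but p.1 now includes an individual covariate.
new.betas <- c("Phi:(Intercept)", "p:(Intercept)", "p:time2", "p:an_indiv_covariate")
init <- build_initial("Phi.1.p.1", new.betas, previous)
print(init)
```

A model whose name has no match in previous gets all zeros in this sketch, which stands in for falling back to the default initial values.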

Value

marklist - list of mark models that were run and a model.table of results

Author(s)

Eldar Rakhimberdiev

See Also

collect.models, mark, create.model.list

Examples


# example not run to reduce time required for checking
do.MSOccupancy=function()
{
#  Get the data
	data(NicholsMSOccupancy)
# Define the models; default of Psi1=~1 and Psi2=~1 is assumed
# p varies by time but p1t=p2t
	p1.p2equal.by.time=list(formula=~time,share=TRUE)
# time-varying p1t and p2t
	p1.p2.different.time=list(p1=list(formula=~time,share=FALSE),p2=list(formula=~time))
#  delta2 model with one rate for times 1-2 and another for times 3-5;
# delta2 defined below
	Delta.delta2=list(formula=~delta2)
	Delta.dot=list(formula=~1)  # constant delta
	Delta.time=list(formula=~time) # time-varying delta
# Process the data for the MSOccupancy model
	NicholsMS.proc=process.data(NicholsMSOccupancy,model="MSOccupancy")
# Create the default design data
	NicholsMS.ddl=make.design.data(NicholsMS.proc)
# Add a field for the Delta design data called delta2.  It is a factor variable
# with 2 levels: times 1-2, and times 3-5.
	NicholsMS.ddl=add.design.data(NicholsMS.proc,NicholsMS.ddl,"Delta",
			type="time",bins=c(0,2,5),name="delta2")
# Create a list using the 2 p models and 3 delta models defined above (6 models total)
	cml=create.model.list("MSOccupancy")
# Fit each model in the list and return the results
	return(mark.wrapper.parallel(cml,data=NicholsMS.proc,ddl=NicholsMS.ddl,
    cpus=2,parallel=TRUE,delete=TRUE))
}
xx=do.MSOccupancy()


RMark documentation built on Aug. 14, 2022, 1:05 a.m.