specPOUMM: Specifying a POUMM fit


Description

Specification and validation of POUMM/PMM settings.

Usage

specifyPOUMM(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE,
  validateSpec = TRUE
)

specifyPOUMM_ATS(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE,
  sigmaeFixed = 0
)

specifyPOUMM_ATSG0(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE,
  sigmaeFixed = 0
)

specifyPOUMM_ATSSeG0(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

specifyPMM(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

specifyPMM_SSeG0(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

specifyPOUMM_ATH2tMeanSe(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

specifyPOUMM_ATH2tMeanSeG0(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

specifyPMM_H2tMeanSe(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

specifyPMM_H2tMeanSeG0(
  z = NULL,
  tree = NULL,
  zMin = -10,
  zMean = 0,
  zMax = 10,
  zVar = 4,
  zSD = sqrt(zVar),
  tMin = 0.1,
  tMean = 2,
  tMax = 10,
  parMapping = NULL,
  parLower = NULL,
  parUpper = NULL,
  g0Prior = NULL,
  parInitML = NULL,
  control = NULL,
  parPriorMCMC = NULL,
  parInitMCMC = NULL,
  parScaleMCMC = NULL,
  nSamplesMCMC = 1e+05,
  nAdaptMCMC = nSamplesMCMC,
  thinMCMC = 100,
  accRateMCMC = 0.01,
  gammaMCMC = 0.50001,
  nChainsMCMC = 3,
  samplePriorMCMC = TRUE,
  parallelMCMC = FALSE
)

Arguments

z, tree

A numeric vector and a phylo object on which the fit is to be done. These arguments are used to guess meaningful values for the parLower, parUpper and parPriorMCMC arguments. See also zMin, zMean, ..., tMax below.

zMin, zMean, zMax, zVar, zSD, tMin, tMean, tMax

Summary statistics of the observed tip-values (z) and root-tip distances (t). Some of these values are used for constructing default parameter values and limits. The default values of these arguments will most likely be meaningless in your specific use-case; they are overwritten with the corresponding statistics from the z and tree arguments if these are specified. If neither z and tree nor these arguments are specified, then the arguments parLower, parUpper and parPriorMCMC must be specified explicitly.

parMapping

An R function that accepts either a numeric vector or a numeric matrix as its argument. It should transform the input vector, or each row of an input matrix, into a (row-)vector of the POUMM parameters alpha, theta, sigma, sigmae, g0. For a vector input the function should return a vector with named elements alpha, theta, sigma, sigmae, g0. For a matrix input the function should return a matrix with the same number of rows and with columns alpha, theta, sigma, sigmae, g0.

Only finite non-negative values are allowed for alpha, sigma, and sigmae. Returning Inf, -Inf, NA or NaN for any of these parameters will result in an error during likelihood calculation. Only finite numerical values are allowed for theta.

The parameter g0 is treated in a special way and can assume either a finite numerical value or one of NA or NaN. If g0 is a finite value, this value is used together with the corresponding values of alpha, theta, sigma, and sigmae for likelihood calculation. If g0 = NA (meaning value Not Available), the value of g0 is calculated analytically during likelihood calculation in order to maximise one of the following:

  1. if a normal prior for g0 was specified (see g0Prior), pdf(z | α, θ, σ, σ_e, g0, tree) x prior(g0).

  2. otherwise, pdf(z | α, θ, σ, σ_e, g0, tree).

If g0 = NaN (meaning Not a Number), then the likelihood is marginalized with respect to the prior distribution of g0 (see g0Prior), i.e. the likelihood returned is:

  pdf(z | α, θ, σ, σ_e, tree) = Integral(pdf(z | α, θ, σ, σ_e, g0, tree) x pdf(g0) d g0; g0 from -∞ to +∞)

In this case (g0 = NaN), if g0Prior is not specified, it is assumed to be the stationary OU normal distribution with mean theta and variance varOU(Inf, alpha, sigma).

Examples:

 
 # Default for POUMM: identity for alpha, theta, sigma, sigmae, NA for g0.
 parMapping = function(par) {
   if(is.matrix(par)) {
     atsseg0 <- cbind(par[, 1:4, drop = FALSE], NA) 
     colnames(atsseg0) <- c("alpha", "theta", "sigma", "sigmae", "g0")
   } else {
     atsseg0 <- c(par[1:4], NA) 
     names(atsseg0) <- c("alpha", "theta", "sigma", "sigmae", "g0")
   }
   atsseg0
 }
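
A non-default mapping follows the same pattern. The sketch below is only an illustration of the contract described above: the function name parMappingFixedSe and the value 0.5 for sigmaeFixed are hypothetical and are not taken from the package. It estimates alpha, theta and sigma, fixes sigmae, and returns NaN for g0, which requests marginalization over the prior of g0.

```r
# Hypothetical fixed value for sigmae (an assumption, not a package default).
sigmaeFixed <- 0.5

# Sketch of a custom mapping: par holds alpha, theta, sigma only;
# sigmae is fixed and g0 is NaN (likelihood marginalized over the g0 prior).
parMappingFixedSe <- function(par) {
  if (is.matrix(par)) {
    atsseg0 <- cbind(par[, 1:3, drop = FALSE], sigmaeFixed, NaN)
    colnames(atsseg0) <- c("alpha", "theta", "sigma", "sigmae", "g0")
  } else {
    atsseg0 <- c(par[1:3], sigmaeFixed, NaN)
    names(atsseg0) <- c("alpha", "theta", "sigma", "sigmae", "g0")
  }
  atsseg0
}

# A vector input gives a named vector of length 5.
parMappingFixedSe(c(alpha = 1, theta = 4, sigma = 0.2))

# A matrix input gives one output row per input row.
parMappingFixedSe(rbind(c(1, 4, 0.2), c(2, 5, 0.3)))
```

With such a mapping, parLower, parUpper and the MCMC-related arguments would need to have three elements instead of four.
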
parLower, parUpper

Two named numeric vectors of the same length indicating the boundaries of the search region for the ML-fit. Calling parMapping on parLower and parUpper should result in appropriate values of the POUMM parameters alpha, theta, sigma, sigmae and g0. By default, the upper limit for alpha is set to 69.31 / tMean, which corresponds to a value of alpha so large that the time for half-way convergence towards theta from any initial trait value is 100 times shorter than the mean root-tip distance in the tree. Examples:

# Default for POUMM:
parLower = c(alpha = 0, theta = zMin - 2 * (zMax - zMin), sigma = 0, sigmae = 0)
parUpper = c(alpha = 69.31 / tMean, theta = zMax + 2 * (zMax - zMin),
             sigma = sigmaOU(H2 = .99, alpha = 69.31 / tMean, sigmae = 2 * zSD,
                             t = tMean),
             sigmae = 2 * zSD)
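
The constant 69.31 can be checked numerically. Under an OU process the expected distance to theta decays as exp(-alpha * t), so the time for half-way convergence is log(2) / alpha; setting it to tMean / 100 gives alpha = 100 * log(2) / tMean, or about 69.31 / tMean. The short check below uses tMean = 2, the default argument value; the variable names alphaUpper and halfLife are introduced here for illustration only.

```r
tMean <- 2                      # default argument value from the Usage section
alphaUpper <- 69.31 / tMean     # default upper limit for alpha

# Time for half-way convergence towards theta under an OU process.
halfLife <- log(2) / alphaUpper

# Ratio of the mean root-tip distance to the half-life; close to 100.
tMean / halfLife
```
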
g0Prior

Either NULL or a list with named numeric or character members "mean" and "var". Specifies a prior normal distribution for the parameter g0. If the members "mean" and "var" are character strings, they are evaluated as R-expressions, which is useful if they are functions of some of the other parameters. Note that if g0Prior is not NULL and g0 is not NaN (either a fixed number or NA), then the likelihood maximization takes into account the prior for g0, that is, the optimization is done over the product p(g0) x lik(data|g0, other parameters and tree). This can be helpful to prevent extremely large or small estimates of g0. To avoid this behavior and always maximize the likelihood, use g0Prior = NULL.
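
To make the character form concrete, the sketch below writes a g0Prior whose mean and variance are expressions in the other parameters and evaluates them by hand. It is an illustration under assumptions: the evaluation with eval(parse(...)) against a list of parameter values mimics the described behavior and is not taken from the package source, and the parameter values are made up. The variance expression sigma^2 / (2 * alpha) is the standard stationary variance of an OU process.

```r
# A prior for g0 given as R-expressions in the other parameters.
g0Prior <- list(mean = "theta", var = "sigma^2 / (2 * alpha)")

# Hypothetical parameter values used only to evaluate the expressions.
par <- list(alpha = 2, theta = 4, sigma = 1)

# Evaluate each character member with the parameters in scope
# (an assumed mechanism, shown for illustration).
priorMean <- eval(parse(text = g0Prior$mean), envir = par)
priorVar  <- eval(parse(text = g0Prior$var),  envir = par)

c(mean = priorMean, var = priorVar)
```
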

parInitML

A named vector (like parLower and parUpper) or a list of such vectors - starting points for optim.

control

List of parameters passed on to optim in the ML-fit, default list(factr=1e9), see ?optim.

parPriorMCMC

A function of a numeric parameter-vector returning the log-prior for this parameter vector. Example:

# Default for POUMM:
 parPriorMCMC = function(par) {
   dexp(par[1], rate = tMean / 6.931, TRUE) + 
     dnorm(par[2], zMean, 10 * zSD, TRUE) +
     dexp(par[3],  rate = sqrt(tMean / (zVar * 0.6931)), TRUE) + 
     dexp(par[4], rate = 2 / zSD, TRUE)
 }
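
The default prior can be evaluated directly once the summary statistics are defined. The sketch below uses the default argument values from the Usage section (zMean = 0, zVar = 4, tMean = 2); the two test points are arbitrary. It also shows that a negative alpha, sigma or sigmae receives a log-prior of -Inf, because the exponential density is zero there.

```r
# Default summary statistics from the Usage section.
zMean <- 0
zVar <- 4
zSD <- sqrt(zVar)
tMean <- 2

# Default log-prior for POUMM, as listed above.
parPriorMCMC <- function(par) {
  dexp(par[1], rate = tMean / 6.931, TRUE) +
    dnorm(par[2], zMean, 10 * zSD, TRUE) +
    dexp(par[3], rate = sqrt(tMean / (zVar * 0.6931)), TRUE) +
    dexp(par[4], rate = 2 / zSD, TRUE)
}

# Finite log-prior at an admissible point (alpha, theta, sigma, sigmae).
lpInside <- parPriorMCMC(c(alpha = 1, theta = 0, sigma = 1, sigmae = 1))

# -Inf as soon as one of the non-negative parameters is negative.
lpOutside <- parPriorMCMC(c(alpha = -1, theta = 0, sigma = 1, sigmae = 1))

# Prior mean of alpha is 1 / rate = 6.931 / tMean.
alphaPriorMean <- 6.931 / tMean
```

The prior mean of alpha, 6.931 / tMean, is one tenth of the default upper limit 69.31 / tMean and corresponds to a half-life of about tMean / 10.
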
parInitMCMC

a function(chainNo, fitML) returning an initial state of an MCMC as a vector. The argument fitML can be used to specify an initial state, close to a previously found likelihood optimum. Example:

 
 # Default for POUMM:
 parInitMCMC = function(chainNo, fitML) {
   if(!is.null(fitML)) {
     parML <- fitML$par
   } else {
     parML <- NULL
   }
   
   init <- rbind(
     c(alpha = 0, theta = 0, sigma = 1, sigmae = 0),
     parML,
     c(alpha = 0, theta = 0, sigma = 1, sigmae = 1)
   )
   
   init[(chainNo - 1) %% nrow(init) + 1, ]
 }
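
The index expression (chainNo - 1) %% nrow(init) + 1 cycles through the candidate starting points. The sketch below repeats the default function and calls it with a made-up fitML list (only its par element matters here) to show the cycling, and with fitML = NULL to show that rbind() drops the NULL row, leaving two starting points.

```r
# Default initial-state function for POUMM, as listed above.
parInitMCMC <- function(chainNo, fitML) {
  if (!is.null(fitML)) {
    parML <- fitML$par
  } else {
    parML <- NULL
  }

  init <- rbind(
    c(alpha = 0, theta = 0, sigma = 1, sigmae = 0),
    parML,
    c(alpha = 0, theta = 0, sigma = 1, sigmae = 1)
  )

  init[(chainNo - 1) %% nrow(init) + 1, ]
}

# Hypothetical ML fit; the values are invented for illustration.
fitML <- list(par = c(alpha = 2, theta = 4, sigma = 0.5, sigmae = 0.3))

# Three candidate rows: chains 1, 2, 3 use rows 1, 2, 3; chain 4 wraps to row 1.
for (chainNo in 1:4) print(parInitMCMC(chainNo, fitML))

# Without an ML fit only two rows remain, so chain 3 wraps to row 1.
parInitMCMC(3, NULL)
```
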
parScaleMCMC

Numeric matrix indicating the initial jump-distribution matrix for the MCMC fit. Default for POUMM is diag(4).

nSamplesMCMC

Integer indicating the length of each MCMC chain. Defaults to 1e5.

nAdaptMCMC

Logical indicating whether adaptation of the MCMC jump distribution should be done with respect to the target acceptance rate (accRateMCMC) or integer indicating how many initial MCMC iterations should be used for adaptation of the jump-distribution matrix (see details in ?POUMM). Defaults to nSamplesMCMC meaning continuous adaptation throughout the MCMC.

thinMCMC

Integer indicating the thinning interval of the mcmc-chains. Defaults to 100.

accRateMCMC

numeric between 0 and 1 indicating the target acceptance rate of the adaptive Metropolis sampling (see details in ?POUMM). Default 0.01.

gammaMCMC

Controls the speed of adaptation. Should be in the interval (0.5,1]. A lower gamma leads to faster adaptation. Default value is 0.50001.

nChainsMCMC

integer indicating the number of chains to run. Defaults to 3 chains, from which the first one is a sample from the prior distribution (see samplePriorMCMC).

samplePriorMCMC

Logical indicating whether the first chain should sample from the prior (see nChainsMCMC). This is useful for checking how much the prior and posterior distributions overlap. Default is TRUE.

parallelMCMC

Logical indicating whether the MCMC chains should be run in parallel. Setting this option to TRUE uses the foreach::foreach() %dopar% { } construct for the MCMC fit. For the chains to actually run in parallel, you should create a computing cluster and register it as the parallel back-end (see the example in the package vignette and the web-page https://github.com/tobigithub/R-parallel/wiki/R-parallel-Setups).
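
One common way to register a back-end is sketched below. It assumes that the doParallel package is installed (the block is skipped otherwise); the cluster size of 2 is arbitrary, and the fitting step is left as a comment because it depends on your data.

```r
# Sketch: register a parallel back-end before a fit with parallelMCMC = TRUE.
# Assumes the doParallel package is available; 2 workers is an arbitrary choice.
if (requireNamespace("doParallel", quietly = TRUE)) {
  cluster <- parallel::makeCluster(2)
  doParallel::registerDoParallel(cluster)

  # ... build the spec with parallelMCMC = TRUE and run the fit here ...

  # Release the workers once the fit has finished.
  parallel::stopCluster(cluster)
}
```

Without a registered back-end, %dopar% falls back to sequential execution and issues a warning.
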

validateSpec

Logical indicating whether the passed parameters should be validated. This parameter is used internally and should always be TRUE.

sigmaeFixed

fixed value for the sigmae parameter (used in specifyPOUMM_ATS and specifyPOUMM_ATSG0).

Value

A named list to be passed as a spec argument to POUMM.

Functions


POUMM documentation built on Oct. 27, 2020, 5:06 p.m.