bootComb: Combine parameter estimates via bootstrap


View source: R/bootComb.R

Description

Propagates uncertainty from several estimates when these estimates are combined via a function. The parametric bootstrap is used to simulate values from the distribution of each estimate, building up an empirical distribution of the combined parameter. A confidence interval with the desired coverage is then derived from this empirical distribution, using either the percentile method or the highest density interval.
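The idea can be illustrated by hand in base R. This is only a minimal sketch, not the package's internal code: the two beta distributions and their shape parameters are assumptions chosen for illustration.

```r
# Hand-rolled parametric bootstrap for the product of two probabilities.
set.seed(1)
n <- 1e5                                  # number of bootstrap draws

# Assumed (illustrative) distributions for the two estimates.
p1 <- rbeta(n, shape1 = 40, shape2 = 40)  # centred on 0.5
p2 <- rbeta(n, shape1 = 40, shape2 = 10)  # centred on 0.8

# Combine each pair of draws to get the empirical distribution of the product.
combined <- p1 * p2

# Percentile method: take the 2.5% and 97.5% empirical quantiles.
ci <- quantile(combined, probs = c(0.025, 0.975), names = FALSE)
print(ci)
```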

Usage

bootComb(
  distList,
  combFun,
  N = 1e+06,
  distributions = NULL,
  qLowVect = NULL,
  qUppVect = NULL,
  alphaVect = 0.05,
  Sigma = NULL,
  method = "quantile",
  coverage = 0.95,
  doPlot = FALSE,
  legPos = "topright",
  returnBootVals = FALSE,
  validRange = NULL,
  seed = NULL
)

Arguments

distList

If Sigma is set to NULL, this is a list object where each element of the list is a sampling function for a probability distribution (e.g. rnorm, rbeta, ...). If Sigma is specified, then this needs to be a list of quantile functions for the distributions for each parameter.

combFun

The function to combine the different estimates to a new parameter. Needs to take a single list as input argument, with one element of the list for each estimate. This list needs to be of the same length as distList.
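As a small sketch of what such a function can look like, the hypothetical ratioFun below takes one list and combines its two elements. It is written with vectorised arithmetic, so it gives sensible results whether each list element is a single value or a vector of values.

```r
# Hypothetical combining function: one list argument, one element per estimate.
ratioFun <- function(pars) {
  pars[[1]] / (pars[[1]] + pars[[2]])
}

# Called with a single value per estimate.
single <- ratioFun(list(2, 6))
print(single)

# Called with a vector of values per estimate (element-wise result).
several <- ratioFun(list(c(2, 1), c(6, 3)))
print(several)
```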

N

The number of bootstrap samples to take. Defaults to 1e6.

distributions

As an alternative to specifying distList, the parameters distributions, qLowVect, qUppVect and (optionally) alphaVect can be specified. The first 3 of these need to be either all specified as vectors of the same length or all set to NULL. The distributions parameter needs to be a vector specifying the name of the distribution for each parameter (one of "beta", "exponential", "gamma", "normal", "Poisson" or "NegativeBinomial").

qLowVect

As an alternative to specifying distList, the parameters distributions, qLowVect, qUppVect and (optionally) alphaVect can be specified. The first 3 of these need to be either all specified as vectors of the same length or all set to NULL. The qLowVect parameter needs to be a vector specifying the lower confidence interval limit for each parameter.

qUppVect

As an alternative to specifying distList, the parameters distributions, qLowVect, qUppVect and (optionally) alphaVect can be specified. The first 3 of these need to be either all specified as vectors of the same length or all set to NULL. The qUppVect parameter needs to be a vector specifying the upper confidence interval limit for each parameter.

alphaVect

As an alternative to specifying distList, the parameters distributions, qLowVect, qUppVect and (optionally) alphaVect can be specified. The first 3 of these need to be either all specified as vectors of the same length or all set to NULL. The alphaVect parameter needs to be a vector specifying the alpha level (i.e. 1 minus the coverage) of each confidence interval. It can be specified as a single number if it is the same for all parameters. Defaults to 0.05.
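To see how a reported interval and its alpha level together pin down a distribution, the normal case can be worked out by hand. The sketch below uses made-up interval limits and the standard closed-form relationship for a normal distribution; it is not a description of how the package itself fits distributions.

```r
# Illustrative interval limits (assumed values) and alpha level.
qLow  <- 1.2
qUpp  <- 3.4
alpha <- 0.05                      # i.e. a 95% confidence interval

# For a normal distribution the interval is symmetric about the mean,
# and its half-width equals qnorm(1 - alpha / 2) standard deviations.
mu    <- (qLow + qUpp) / 2
sigma <- (qUpp - qLow) / (2 * qnorm(1 - alpha / 2))

# Check: the alpha/2 and 1 - alpha/2 quantiles reproduce the interval.
recovered <- qnorm(c(alpha / 2, 1 - alpha / 2), mean = mu, sd = sigma)
print(recovered)
```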

Sigma

Set to NULL if parameters are assumed to be independent (the default). If specified, this needs to be a valid covariance matrix for a multivariate normal distribution with variances equal to 1 for all variables (in other words, this really is a correlation matrix).
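One common way to combine a correlation matrix with quantile functions is a Gaussian copula: draw correlated standard normals, map them to uniforms, then push the uniforms through each quantile function. The base-R sketch below illustrates that construction; that the package proceeds in exactly this way is an assumption, and the beta shape parameters are illustrative.

```r
# A 2 x 2 correlation matrix (unit variances, correlation 0.5).
Sigma <- matrix(c(1,   0.5,
                  0.5, 1), nrow = 2, byrow = TRUE)

# Basic validity checks: symmetric, unit diagonal, positive definite.
stopifnot(isSymmetric(Sigma),
          all(diag(Sigma) == 1),
          all(eigen(Sigma, symmetric = TRUE)$values > 0))

set.seed(1)
n <- 1e4
# Correlated standard normals: independent draws times the Cholesky factor.
z <- matrix(rnorm(n * 2), ncol = 2) %*% chol(Sigma)
# Map to uniform marginals, keeping the dependence structure.
u <- pnorm(z)
# Apply a quantile function per parameter (assumed beta distributions).
x1 <- qbeta(u[, 1], shape1 = 40, shape2 = 40)
x2 <- qbeta(u[, 2], shape1 = 40, shape2 = 10)

print(cor(x1, x2))   # positive, reflecting the specified dependence
```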

method

The method used to derive a confidence interval from the empirical distribution of the combined parameter. Needs to be one of 'quantile' (default; uses the percentile method to derive the confidence interval) or 'hdi' (computes the highest density interval).
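The difference between the two methods shows up most clearly for skewed distributions. The sketch below computes both intervals by hand for an assumed gamma-distributed sample; the highest density interval is approximated as the shortest interval containing the requested share of the sorted draws, which is a simple stand-in and not necessarily what the package calls internally.

```r
set.seed(1)
x <- rgamma(1e5, shape = 2, rate = 1)   # right-skewed example sample
coverage <- 0.95

# Percentile method: equal tail probability on each side.
tail <- (1 - coverage) / 2
pctCI <- quantile(x, probs = c(tail, 1 - tail), names = FALSE)

# Shortest interval containing ceiling(coverage * n) of the sorted draws.
xs <- sort(x)
n  <- length(xs)
k  <- ceiling(coverage * n)
widths <- xs[k:n] - xs[1:(n - k + 1)]   # width of every window of k draws
i  <- which.min(widths)
hdiCI <- c(xs[i], xs[i + k - 1])

print(pctCI)
print(hdiCI)
```

For a symmetric, unimodal distribution the two intervals roughly coincide; for a skewed one the highest density interval is shorter and shifted towards the mode.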

coverage

The desired coverage of the resulting confidence interval. Defaults to 0.95.

doPlot

Logical; indicates whether a graph should be produced showing the input distributions and the resulting empirical distribution of the combined estimate together with the reported confidence interval. Defaults to FALSE.

legPos

Legend position (only used if doPlot==TRUE); either NULL (no legend) or one of "top", "topleft", "topright", "bottom", "bottomleft", "bottomright", "left", "right", "center".

returnBootVals

Logical; if TRUE then the parameter values computed from the bootstrapped input parameter values will be returned; values for the individual parameters will be reported as a second list element; defaults to FALSE.

validRange

Optional; if not NULL, a vector of length 2 giving the range within which the values obtained from the bootstrapped input parameters must lie; values outside this range will be discarded. Behaviour that results in the need for this option arises when parameters are not independent. Use with caution.
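The effect of discarding out-of-range values can be sketched in base R. The normal sample below is only a stand-in for combined values that should be probabilities, and treating the range limits as inclusive is an assumption made for this illustration.

```r
set.seed(1)
# Stand-in for combined values that should lie in [0, 1] but sometimes do not.
vals <- rnorm(1e4, mean = 0.9, sd = 0.1)
validRange <- c(0, 1)

# Keep only values inside the range (limits treated as inclusive here).
keep <- vals >= validRange[1] & vals <= validRange[2]
discarded <- mean(!keep)              # proportion of values thrown away
print(discarded)

# Interval computed from the retained values only.
ci <- quantile(vals[keep], probs = c(0.025, 0.975), names = FALSE)
print(ci)
```

Checking the discarded proportion is worthwhile: if it is large, the resulting interval describes a truncated distribution rather than the one that was specified.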

seed

If desired, a random seed can be specified so that the same results can be reproduced.

Value

A list with 3 elements:

conf.int

A vector of length 2 giving the lower and upper limits of the computed confidence interval.

bootstrapValues

A vector containing the computed / combined parameter values from the bootstrap samples of the input parameters. (Only non-NULL if returnBootVals is set to TRUE.)

bootstrapValuesInput

A list where each element is the vector of the bootstrapped values for the corresponding input parameter. This can be useful to check the dependence structure that was specified. (Only non-NULL if returnBootVals is set to TRUE.)

See Also

hdi

Examples

## Example 1 - product of 2 probability parameters for which only the 95% CIs are reported
dist1<-getBetaFromCI(qLow=0.4,qUpp=0.6,alpha=0.05)
dist2<-getBetaFromCI(qLow=0.7,qUpp=0.9,alpha=0.05)
distListEx<-list(dist1$r,dist2$r)
combFunEx<-function(pars){pars[[1]]*pars[[2]]}
bootComb(distList=distListEx,
         combFun=combFunEx,
         doPlot=TRUE,
         method="hdi",
         N=1e5, # reduced from N=1e6 so that it runs quicker; larger values => more accurate
         seed=352)

# Alternatively, the same example can be run in just 2 lines of code:
combFunEx<-function(pars){pars[[1]]*pars[[2]]}
bootComb(distributions=c("beta","beta"),
         qLowVect=c(0.4,0.7),
         qUppVect=c(0.6,0.9),
         combFun=combFunEx,
         doPlot=TRUE,
         method="hdi",
         N=1e5, # reduced from N=1e6 so that it runs quicker; larger values => more accurate
         seed=352)

## Example 2 - sum of 3 Gaussian distributions
dist1<-function(n){rnorm(n,mean=5,sd=3)}
dist2<-function(n){rnorm(n,mean=2,sd=2)}
dist3<-function(n){rnorm(n,mean=1,sd=0.5)}
distListEx<-list(dist1,dist2,dist3)
combFunEx<-function(pars){pars[[1]]+pars[[2]]+pars[[3]]}
bootComb(distList=distListEx,combFun=combFunEx,doPlot=TRUE,method="quantile")

# Compare with theoretical result:
exactCI<-qnorm(c(0.025,0.975),mean=5+2+1,sd=sqrt(3^2+2^2+0.5^2))
print(exactCI)
x<-seq(-10,30,length=1e3)
y<-dnorm(x,mean=5+2+1,sd=sqrt(3^2+2^2+0.5^2))
lines(x,y,col="red")
abline(v=exactCI[1],col="red",lty=3)
abline(v=exactCI[2],col="red",lty=3)

## Example 3 - same as Example 1 but assuming the 2 parameters to be dependent / correlated
combFunEx<-function(pars){pars[[1]]*pars[[2]]}
bootComb(distributions=c("beta","beta"),
         qLowVect=c(0.4,0.7),
         qUppVect=c(0.6,0.9),
         Sigma=matrix(byrow=TRUE,ncol=2,c(1,0.5,0.5,1)),
         combFun=combFunEx,
         doPlot=TRUE,
         method="hdi",
         N=1e5, # reduced from N=1e6 so that it runs quicker; larger values => more accurate
         seed=352)

bootComb documentation built on Jan. 31, 2022, 1:07 a.m.