Compute risk ratio and uncertainty based on generalized extreme value model fit to block maxima or minima

Description

Compute risk ratio and uncertainty by fitting a generalized extreme value model, designed specifically for climate data, to block maxima or minima. The risk ratio is the ratio of the probability of exceedance of a pre-specified value under the model fit to the first dataset to the probability under the model fit to the second dataset. Default standard errors are based on the usual MLE asymptotics, using a delta-method-based approximation, but standard errors based on the nonparametric bootstrap and on a likelihood ratio procedure can also be computed.
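
In symbols, with p1 the probability of exceeding returnValue in a single block under the model fit to the first dataset, and p2 the analogous probability for the second dataset:

  riskRatio = p1 / p2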

Usage

calc_riskRatio_gev(returnValue, y1, y2, x1 = NULL, x2 = x1,
  locationFun1 = NULL, locationFun2 = locationFun1, scaleFun1 = NULL,
  scaleFun2 = scaleFun1, shapeFun1 = NULL, shapeFun2 = shapeFun1,
  nReplicates1 = 1, nReplicates2 = 1, replicateIndex1 = NULL,
  replicateIndex2 = NULL, weights1 = NULL, weights2 = NULL,
  xNew1 = NULL, xNew2 = NULL, maxes = TRUE, scaling1 = 1,
  scaling2 = 1, ciLevel = 0.9, bootSE = FALSE, bootControl = list(seed =
  0, n = 250, by = "block"), lrtCI = FALSE, lrtControl = list(bounds =
  c(0.01, 100)), optimArgs = list(method = "Nelder-Mead"))

Arguments

returnValue

numeric value giving the value for which the risk ratio should be calculated; the risk ratio compares the probability of exceeding this value in any single block under the model fits to the two datasets.

y1

a numeric vector of observed maxima or minima values for the first dataset. See Details for how the values of y1 should be ordered if there are multiple replicates and the values of x1 are identical for all replicates.

y2

a numeric vector of observed maxima or minima values for the second dataset. Analogous to y1.

x1

a data frame, or object that can be converted to a data frame with columns corresponding to covariate/predictor/feature variables and each row containing the values of the variable for the corresponding observed maximum/minimum. The number of rows should either equal the length of y1 or (if there is more than one replicate) it can optionally equal the number of observations in a single replicate, in which case the values will be assumed to be the same for all replicates.

x2

Analogous to x1, but for the second dataset.

locationFun1

formula, vector of character strings, or indices describing a linear model (i.e., regression function) for the location parameter using columns from x1 for the first dataset. x1 must be supplied if this is anything other than NULL or ~1.

locationFun2

formula, vector of character strings, or indices describing a linear model (i.e., regression function) for the location parameter using columns from x2 for the second dataset. x2 must be supplied if this is anything other than NULL or ~1.
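
For illustration only: if x1 contained a hypothetical covariate column named year, a linear trend in the location parameter could be specified in any of these equivalent ways:

  locationFun1 = ~year
  locationFun1 = "year"
  locationFun1 = 1    # if 'year' is the first column of x1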

scaleFun1

formula, vector of character strings, or indices describing a linear model (i.e., regression function) for the log of the scale parameter using columns from x1 for the first dataset. x1 must be supplied if this is anything other than NULL or ~1.

scaleFun2

formula, vector of character strings, or indices describing a linear model (i.e., regression function) for the log of the scale parameter using columns from x2 for the second dataset. x2 must be supplied if this is anything other than NULL or ~1.

shapeFun1

formula, vector of character strings, or indices describing a linear model (i.e., regression function) for the shape parameter using columns from x1 for the first dataset. x1 must be supplied if this is anything other than NULL or ~1.

shapeFun2

formula, vector of character strings, or indices describing a linear model (i.e., regression function) for the shape parameter using columns from x2 for the second dataset. x2 must be supplied if this is anything other than NULL or ~1.

nReplicates1

numeric value indicating the number of replicates for the first dataset.

nReplicates2

numeric value indicating the number of replicates for the second dataset.

replicateIndex1

numeric vector providing the index of the replicate corresponding to each element of y1. Used (and therefore required) only when using bootstrapping with the resampling by replicates based on the by element of bootControl.

replicateIndex2

numeric vector providing the index of the replicate corresponding to each element of y2. Analogous to replicateIndex1.

weights1

a vector providing the weights for each observation in the first dataset. When there is only one replicate or the weights do not vary by replicate, a vector of length equal to the number of observations. When weights vary by replicate, this should be of equal length to y1. The likelihood contribution of each observation is multiplied by the corresponding weight.

weights2

a vector providing the weights for each observation in the second dataset. Analogous to weights1.

xNew1

object of the same form as x1, providing covariate/predictor/feature values for which one desires log risk ratios.

xNew2

object of the same form as x2, providing covariate/predictor/feature values for which log risk ratios are desired. Must provide the same number of covariate sets as xNew1 as the risk ratio is based on contrasting return probabilities under xNew1 and xNew2.
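
As a sketch (assuming a hypothetical covariate named year in both datasets), one could compare exceedance probabilities at a common covariate value by supplying one covariate set to each argument:

  xNew1 = data.frame(year = 2020)
  xNew2 = data.frame(year = 2020)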

maxes

logical indicating whether analysis is for block maxima (TRUE) or block minima (FALSE); in the latter case, the function works with the negative of the values, changing the sign of the resulting location parameters.

scaling1

positive-valued scalar used to scale the data values of the first dataset for more robust optimization performance. When multiplied by the values, it should produce values with magnitude around 1.

scaling2

positive-valued scalar used to scale the data values of the second dataset for more robust optimization performance. When multiplied by the values, it should produce values with magnitude around 1.
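
For example (illustrative only), if the raw values of the second dataset were on the order of 1e5, a choice such as the following would bring them to magnitude around 1:

  scaling2 = 1e-5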

ciLevel

statistical confidence level for confidence intervals; in repeated experimentation, this proportion of confidence intervals should contain the true risk ratio. Note that if only one endpoint of the resulting interval is used, for example the lower bound, then the effective confidence level increases by half of one minus ciLevel. For example, a two-sided 0.90 confidence interval corresponds to a one-sided 0.95 confidence interval.

bootSE

logical indicating whether to use the bootstrap to estimate standard errors.

bootControl

a list of control parameters for the bootstrapping. See Details.

lrtCI

logical indicating whether to calculate a likelihood ratio-based confidence interval.

lrtControl

list containing a single component, bounds, which sets the range inside which the algorithm searches for the endpoints of the likelihood ratio-based confidence interval. This avoids numerical issues with endpoints converging to zero and infinity. If an endpoint is not found within the interval, it is set to NA.

optimArgs

a list with named components matching exactly any arguments that the user wishes to pass to optim. See help(optim) for details. Of particular note, 'method' can be used to choose the optimization method used for maximizing the log-likelihood to fit the model and 'control=list(maxit=VALUE)' for a user-chosen VALUE can be used to increase the number of iterations if the optimization is converging slowly.
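
For example (illustrative values, not the defaults):

  optimArgs = list(method = "BFGS", control = list(maxit = 2000))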

Details

See fit_gev for more information on fitting the block maxima model for each dataset, including details on blocking and replication, and for information on the bootControl argument.

Author(s)

Christopher J. Paciorek

References

Jeon S., C.J. Paciorek, and M.F. Wehner. 2016. Quantile-based bias correction and uncertainty quantification of extreme event attribution statements. Weather and Climate Extremes. In press. arXiv preprint: http://arxiv.org/abs/1602.04139.

Examples

# need examples
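
## A minimal sketch with simulated block maxima (hypothetical data, not from
## the package authors); the call pattern follows the Usage section above.
# library(climextRemes)  # assumed to be the package providing calc_riskRatio_gev
set.seed(1)
y1 <- 10 + rexp(50)  # 50 simulated annual maxima, first scenario
y2 <- 11 + rexp(50)  # 50 simulated annual maxima, second scenario
## risk ratio for exceeding the value 14 in a single block, with a 90% interval
result <- calc_riskRatio_gev(returnValue = 14, y1 = y1, y2 = y2,
                             ciLevel = 0.9, bootSE = FALSE)
result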