computeBounds: Computation of the bounds
In stepR: Multiscale Change-Point Inference

computeBounds

R Documentation

Computation of the bounds

Description

Computes the multiscale contraint given by the multiscale test, (3.12) in the vignette. In more detail, returns the bounds of the interval of parameters for which the test statistic is smaller than or equal to the critical value for the corresponding length, i.e. the two solutions resulting from equating the test statistic to the critical value.
If q == NULL a Monte-Carlo simulation is required for computing critical values. Since a Monte-Carlo simulation lasts potentially much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, this package saves them by default in the workspace and on the file system such that a second call requiring the same Monte-Carlo simulation will be much faster. For more details, in particular to which arguments the Monte-Carlo simulations are specific, see Section Storing of Monte-Carlo simulations below. Progress of a Monte-Carlo simulation can be reported by the argument messages and the saving can be controlled by the argument option, both can be specified in ... and are explained in monteCarloSimulation and critVal, respectively.

Usage

computeBounds(y, q = NULL, alpha = NULL, family = NULL,
              intervalSystem = NULL, lengths = NULL, ...)

Arguments

`y`	a numeric vector containing the observations
`q`	either `NULL`, then the vector of critical values at level `alpha` will be computed from a Monte-Carlo simulation, or a numeric giving the global quantile or a numeric vector giving the vector of critical values. Either `q` or `alpha` must be given. Otherwise, `alpha == 0.5` is chosen with a `warning`. This argument will be passed to `critVal` to obtain the needed critical values. Additional parameters for the computation of `q` can be specified in `...`, for more details see the documentation of `critVal`. Please note that by default the Monte-Carlo simulation will be saved in the workspace and on the file system, for more details see Section Storing of Monte-Carlo simulations below
`alpha`	a probability, i.e. a single numeric between 0 and 1, giving the significance level. Its choice is a trade-off between data fit and parsimony of the estimator. In other words, this argument balances the risks of missing change-points and detecting additional artefacts. For more details on this choice see (Frick et al., 2014, section 4) and (Pein et al., 2017, section 3.4). Either `q` or `alpha` must be given. Otherwise, `alpha == 0.5` is chosen with a `warning`
`family`	a string specifying the assumed parametric family, for more details see parametricFamily, currently `"gauss"`, `"hsmuce"` and `"mDependentPS"` are supported. By default (`NULL`) `"gauss"` is assumed
`intervalSystem`	a string giving the used interval system, either `"all"` for all intervals, `"dyaLen"` for all intervals of dyadic length or `"dyaPar"` for the dyadic partition, for more details see intervalSystem. By default (`NULL`) the default interval system of the specified parametric family will be used, which one this will be is described in `parametricFamily`
`lengths`	an integer vector giving the set of lengths, i.e. only intervals of these lengths will be considered. Note that not all lengths are possible for all interval systems and for all parametric families, see intervalSystem and parametricFamily, respectively, to see which ones are allowed. By default (`NULL`) all lengths that are possible for the specified `intervalSystem` and for the specified parametric family will be used
`...`	there are two groups of further arguments: further parameters of the parametric family. Depending on argument `family` some might be required, but others might be optional, please see parametricFamily for more details, further parameters that will be passed to `critVal`. `critVal` will be called automatically with the number of observations `n = length(y)`, the arguments `family`, `intervalSystem`, `lengths`, `q` and `output` set. For these arguments no user interaction is required and possible, all other arguments of `critVal` can be passed additionally

Value

A data.frame containing two integer vectors li and ri and two numeric vectors lower and upper. For each interval in the set of intervals specified by intervalSystem and lengths li and ri give the left and right index of the interval and lower and upper give the lower and upper bounds for the parameter on the given interval.

Storing of Monte-Carlo simulations

If q == NULL a Monte-Carlo simulation is required for computing critical values. Since a Monte-Carlo simulation lasts potentially much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, this package offers multiple possibilities for saving and loading the simulations. Progress of a simulation can be reported by the argument messages which can be specified in ... and is explained in the documentation of monteCarloSimulation. Each Monte-Carlo simulation is specific to the number of observations, the parametric family (including certain parameters, see parametricFamily) and the interval system, and for simulations of class "MCSimulationMaximum", additionally, to the set of lengths and the used penalty. Monte-Carlo simulations can also be performed for a (slightly) larger number of observations n_q given in the argument nq in ... and explained in the documentation of critVal, which avoids extensive resimulations for only a little bit varying number of observations. Simulations can either be saved in the workspace in the variable critValStepRTab or persistently on the file system for which the package R.cache is used. Moreover, storing in and loading from variables and RDS files is supported. Finally, a pre-simulated collection of simulations can be accessed by installing the package stepRdata available from http://www.stochastik.math.uni-goettingen.de/stepRdata_1.0-0.tar.gz. The simulation, saving and loading can be controlled by the argument option which can be specified in ... and is explained in the documentation of critVal. By default simulations will be saved in the workspace and on the file system. For more details and for how simulation can be removed see Section Simulating, saving and loading of Monte-Carlo simulations in critVal.

Note

Depending on intervalSystem and lengths the intervals might be ordered differently to allow fast computation. For most applications the order should not matter. Otherwise, the entries can be reordered with order, an example is given below.

References

Frick, K., Munk, A., Sieling, H. (2014) Multiscale change-point inference. With discussion and rejoinder by the authors. Journal of the Royal Statistical Society, Series B 76(3), 495–580.

Pein, F., Sieling, H., Munk, A. (2017) Heterogeneous change point inference. Journal of the Royal Statistical Society, Series B, 79(4), 1207–1227.

Examples



y <- c(rnorm(50), rnorm(50, 1))

# the multiscale contraint
bounds <- computeBounds(y, alpha = 0.5)

# the order of the bounds depends on intervalSystem and lengths
# to allow fast computation
# if a specific order is required it can be reordered by order
# b is ordered with increasing left indices and increasing right indices
b <- bounds[order(bounds$li, bounds$ri), ]
attr(b, "row.names") <- seq(along = b$li)

# higher significance level for larger detection power, but less confidence
computeBounds(y, alpha = 0.99)

# smaller significance level for stronger confidence statements, but at
# the risk of missing change-points
computeBounds(y, alpha = 0.05)

# different interval system, lengths, penalty and given parameter sd
computeBounds(y, alpha = 0.5, intervalSystem = "dyaLen",
              lengths = c(1L, 2L, 4L, 8L), penalty = "weights",
              weights = c(0.4, 0.3, 0.2, 0.1), sd = 0.5)
              
# with given q
identical(computeBounds(y, q = critVal(100L, alpha = 0.5)), bounds)
identical(computeBounds(y, q = critVal(100L, alpha = 0.5, output = "value")),
          bounds)

# the above calls saved and (attempted to) load Monte-Carlo simulations and
# simulated them for nq = 128 observations
# in the following call no saving, no loading and simulation for n = 100
# observations is required, progress of the simulation will be reported
computeBounds(y, alpha = 0.5, messages = 1000L,
              options = list(simulation = "vector",
                             load = list(), save = list()))
                             
# with given stat to compute q
stat <- monteCarloSimulation(n = 128L)
identical(computeBounds(y, alpha = 0.5, stat = stat),
          computeBounds(y, alpha = 0.5, options = list(load = list())))

stepR documentation built on Oct. 18, 2024, 5:10 p.m.