Description Details Storing of Monte-Carlo simulations References See Also Examples

Allows fitting of step-functions to univariate serial data where neither the number of jumps nor their positions is known by implementing the multiscale regression estimators SMUCE (Frick et al., 2014) and HSMUCE (Pein et al., 2017). In addition, confidence intervals for the change-point locations and bands for the unknown signal can be obtained. This is implemented in the function `stepFit`

. Moreover, technical quantities like the statistics itself, bounds or critical values can be computed by other functions of the package. More details can be found in the vignette.

New in version 2.0-0:

`stepFit` | Piecewise constant multiscale inference |

`critVal` | Critical values |

`computeBounds` | Computation of the bounds |

`computeStat` | Computation of the multiscale statistic |

`monteCarloSimulation` | Monte Carlo simulation |

`parametricFamily` | Parametric families |

`intervalSystem` | Interval systems |

`penalty` | Penalties |

From version 1.0-0:

`compareBlocks` | Compare fit blockwise with ground truth |

`neighbours` | Neighbouring integers |

`sdrobnorm` | Robust standard deviation estimate |

`stepblock` | Step function |

`stepcand` | Forward selection of candidate jumps |

`stepfit` | Fitted step function |

`steppath` | Solution path of step-functions |

`stepsel` | Automatic selection of number of jumps |

Mainly used for patchclamp recordings and may be transferred to a specialised package:

`BesselPolynomial` | Bessel Polynomials |

`contMC` | Continuous time Markov chain |

`dfilter` | Digital filters |

`jsmurf` | Reconstruct filtered piecewise constant functions with noise |

`transit` | TRANSIT algorithm for detecting jumps |

Deprecated (please read the documentation of them theirself for more details):

`MRC` | Compute Multiresolution Criterion |

`MRC.1000` | Values of the MRC statistic for 1,000 observations (all intervals) |

`MRC.asymptotic` | "Asymptotic" values of the MRC statistic (all intervals) |

`MRC.asymptotic.dyadic` | "Asymptotic" values of the MRC statistic(dyadic intervals) |

`bounds` | Bounds based on MRC |

`family` | Family of distributions |

`smuceR` | Piecewise constant regression with SMUCE |

If `q == NULL`

in `critVal`

, `stepFit`

or `computeBounds`

a Monte-Carlo simulation is required for computing critical values. Since a Monte-Carlo simulation lasts potentially much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, this package offers multiple possibilities for saving and loading the simulations. Simulations can either be saved in the workspace in the variable `critValStepRTab`

or persistently on the file system for which the package `R.cache`

is used. Moreover, storing in and loading from variables and RDS files is supported. Finally, a pre-simulated collection of simulations can be accessed by installing the package `stepRdata`

available from http://www.stochastik.math.uni-goettingen.de/stepRdata_1.0-0.tar.gz. By default simulations will be saved in the workspace and on the file system. For more details and for how simulation can be removed see Section Simulating, saving and loading of Monte-Carlo simulations in `critVal`

.

Frick, K., Munk, A., Sieling, H. (2014) Multiscale change-point inference. With discussion and rejoinder by the authors. *Journal of the Royal Statistical Society, Series B* **76**(3), 495–580.

Pein, F., Sieling, H., Munk, A. (2017) Heterogeneous change point inference. *Journal of the Royal Statistical Society, Series B*, **79**(4), 1207–1227.

Pein, F., Tecuapetla-Gómez, I., Schütte, O., Steinem, C., Munk, A. (2017) Fully-automatic multiresolution idealization for filtered ion channel recordings: flickering event detection. *arXiv*:1706.03671.

Hotz, T., Schütte, O., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013) Idealizing ion channel recordings by a jump segmentation multiresolution filter. *IEEE Transactions on NanoBioscience* **12**(4), 376–386.

VanDongen, A. M. J. (1996) A new algorithm for idealizing single ion channel data containing multiple unknown conductance levels. *Biophysical Journal* **70**(3), 1303–1315.

Futschik, A., Hotz, T., Munk, A., Sieling, H. (2014) Multiresolution DNA partitioning: statistical evidence for segments. *Bioinformatics*, **30**(16), 2255–2262.

Boysen, L., Kempe, A., Liebscher, V., Munk, A., Wittich, O. (2009) Consistencies and rates of convergence of jump-penalized least squares estimators. *The Annals of Statistics* **37**(1), 157–183.

Davies, P. L., Kovac, A. (2001) Local extremes, runs, strings and multiresolution. *The Annals of Statistics* **29**, 1–65.

Friedrich, F., Kempe, A., Liebscher, V., Winkler, G. (2008) Complexity penalized M-estimation: fast computation. *Journal of Computational and Graphical Statistics* **17**(1), 201–224.

`stepFit`

, `critVal`

, `computeStat`

, `computeBounds`

, `jsmurf`

, `transit`

, `sdrobnorm`

, `compareBlocks`

, parametricFamily, intervalSystem, penalty

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 | ```
# generate random observations
set.seed(1)
n <- 100L
x <- seq(1 / n, 1, 1 / n)
mu <- stepfit(cost = 0, family = "gauss", value = c(0, 3, 0, -2, 0), param = NULL,
leftEnd = x[c(1, 21, 26, 71, 81)],
rightEnd = x[c(20, 25, 70, 80, 100)], x0 = 0,
leftIndex = c(1, 21, 26, 71, 81),
rightIndex = c(20, 25, 70, 80, 100))
sigma0 <- 0.5
epsilon <- rnorm(n, 0, sigma0)
y <- fitted(mu) + epsilon
plot(x, y, pch = 16, col = "grey30", ylim = c(-3, 4))
lines(mu, lwd = 3)
# computation of SMUCE and its confidence statements
fit <- stepFit(y, x = x, alpha = 0.5, jumpint = TRUE, confband = TRUE)
lines(fit, lwd = 3, col = "red", lty = "22")
# confidence intervals for the change-point locations
points(jumpint(fit), col = "red", lwd = 3)
# confidence band
lines(confband(fit), lty = "22", col = "darkred", lwd = 2)
# higher significance level for larger detection power, but less confidence
# suggested for screening purposes
stepFit(y, x = x, alpha = 0.9, jumpint = TRUE, confband = TRUE)
# smaller significance level for the small risk that the number of
# change-points is overestimated with probability not more than 5%,
# but smaller detection power
stepFit(y, x = x, alpha = 0.05, jumpint = TRUE, confband = TRUE)
# different interval system, lengths, penalty and given parameter sd
stepFit(y, x = x, alpha = 0.5, intervalSystem = "dyaLen",
lengths = c(1L, 2L, 4L, 8L), penalty = "weights",
weights = c(0.4, 0.3, 0.2, 0.1), sd = sigma0,
jumpint = TRUE, confband = TRUE)
# the above calls saved and (attempted to) load Monte-Carlo simulations and
# simulated them for nq = 128 observations
# in the following call no saving, no loading and simulation for n = 100
# observations is required, progress of the simulation will be reported
stepFit(y, x = x, alpha = 0.5, jumpint = TRUE, confband = TRUE, messages = 1000L,
options = list(simulation = "vector", load = list(), save = list()))
# critVal was called in stepFit, can be called explicitly,
# for instance outside of a for loop to save computation time
qVector <- critVal(100L, alpha = 0.5)
identical(stepFit(y, x = x, q = qVector, jumpint = TRUE, confband = TRUE), fit)
qValue <- critVal(100L, alpha = 0.5, output = "value")
identical(stepFit(y, x = x, q = qValue, jumpint = TRUE, confband = TRUE), fit)
# computeBounds gives the multiscale contraint
computeBounds(y, alpha = 0.5)
# monteCarloSimulation will be called in critVal if required
# can be called explicitly
stat <- monteCarloSimulation(n = 100L)
identical(critVal(n = 100L, alpha = 0.5, stat = stat),
critVal(n = 100L, alpha = 0.5,
options = list(load = list(), simulation = "vector")))
identical(critVal(n = 100L, alpha = 0.5, stat = stat, output = "value"),
critVal(n = 100L, alpha = 0.5, output = "value",
options = list(load = list(), simulation = "vector")))
stat <- monteCarloSimulation(n = 100L, output = "maximum")
identical(critVal(n = 100L, alpha = 0.5, stat = stat),
critVal(n = 100L, alpha = 0.5,
options = list(load = list(), simulation = "vector")))
identical(critVal(n = 100L, alpha = 0.5, stat = stat, output = "value"),
critVal(n = 100L, alpha = 0.5, output = "value",
options = list(load = list(), simulation = "vector")))
# fit satisfies the multiscale contraint, i.e.
# the computed penalized multiscale statistic is not larger than the global quantile
computeStat(y, signal = fit, output = "maximum") <= qValue
# multiscale vector of statistics is componentwise not larger than
# the vector of critical values
all(computeStat(y, signal = fit, output = "vector") <= qVector)
# family "hsmuce"
set.seed(1)
y <- c(rnorm(50, 0, 1), rnorm(50, 1, 0.2))
plot(x, y, pch = 16, col = "grey30", ylim = c(-2.5, 2))
# computation of HSMUCE and its confidence statements
fit <- stepFit(y, x = x, alpha = 0.5, family = "hsmuce",
jumpint = TRUE, confband = TRUE)
lines(fit, lwd = 3, col = "red", lty = "22")
# confidence intervals for the change-point locations
points(jumpint(fit), col = "red", lwd = 3)
# confidence band
lines(confband(fit), lty = "22", col = "darkred", lwd = 2)
# for comparison SMUCE, not recommend to use here
lines(stepFit(y, x = x, alpha = 0.5,
jumpint = TRUE, confband = TRUE),
lwd = 3, col = "blue", lty = "22")
# family "mDependentPS"
# generate observations from a moving average process
set.seed(1)
y <- c(rep(0, 50), rep(2, 50)) +
as.numeric(arima.sim(n = 100, list(ar = c(), ma = c(0.8, 0.5, 0.3)), sd = sigma0))
correlations <- as.numeric(ARMAacf(ar = c(), ma = c(0.8, 0.5, 0.3), lag.max = 3))
covariances <- sigma0^2 * correlations
plot(x, y, pch = 16, col = "grey30", ylim = c(-2, 4))
# computation of SMUCE for dependent observations with given covariances
fit <- stepFit(y, x = x, alpha = 0.5, family = "mDependentPS",
covariances = covariances, jumpint = TRUE, confband = TRUE)
lines(fit, lwd = 3, col = "red", lty = "22")
# confidence intervals for the change-point locations
points(jumpint(fit), col = "red", lwd = 3)
# confidence band
lines(confband(fit), lty = "22", col = "darkred", lwd = 2)
# for comparison SMUCE for independent observations, not recommend to use here
lines(stepFit(y, x = x, alpha = 0.5,
jumpint = TRUE, confband = TRUE),
lwd = 3, col = "blue", lty = "22")
# with given correlations, standard deviation will be estimated by sdrobnorm
stepFit(y, x = x, alpha = 0.5, family = "mDependentPS",
correlations = correlations, jumpint = TRUE, confband = TRUE)
# examples from version 1.0-0
# estimating step-functions with Gaussian white noise added
# simulate a Gaussian hidden Markov model of length 1000 with 2 states
# with identical transition rates 0.01, and signal-to-noise ratio 2
sim <- contMC(1e3, 0:1, matrix(c(0, 0.01, 0.01, 0), 2), param=1/2)
plot(sim$data, cex = 0.1)
lines(sim$cont, col="red")
# maximum-likelihood estimation under multiresolution constraints
fit.MRC <- smuceR(sim$data$y, sim$data$x)
lines(fit.MRC, col="blue")
# choose number of jumps using BIC
path <- steppath(sim$data$y, sim$data$x, max.blocks=1e2)
fit.BIC <- path[[stepsel.BIC(path)]]
lines(fit.BIC, col="green3", lty = 2)
# estimate after filtering
# simulate filtered ion channel recording with two states
set.seed(9)
# sampling rate 10 kHz
sampling <- 1e4
# tenfold oversampling
over <- 10
# 1 kHz 4-pole Bessel-filter, adjusted for oversampling
cutoff <- 1e3
df.over <- dfilter("bessel", list(pole=4, cutoff=cutoff / sampling / over))
# two states, leaving state 1 at 10 Hz, state 2 at 20 Hz
rates <- rbind(c(0, 10), c(20, 0))
# simulate 0.5 s, level 0 corresponds to state 1, level 1 to state 2
# noise level is 0.3 after filtering
Sim <- contMC(0.5 * sampling, 0:1, rates, sampling=sampling, family="gaussKern",
param = list(df=df.over, over=over, sd=0.3))
plot(Sim$data, pch = ".")
lines(Sim$discr, col = "red")
# fit under multiresolution constraints using filter corresponding to sample rate
df <- dfilter("bessel", list(pole=4, cutoff=cutoff / sampling))
Fit.MRC <- jsmurf(Sim$data$y, Sim$data$x, param=df, r=1e2)
lines(Fit.MRC, col = "blue")
# fit using TRANSIT
Fit.trans <- transit(Sim$data$y, Sim$data$x)
lines(Fit.trans, col = "green3", lty=2)
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.