hilde: HILDE

Description

Implements the Heterogeneous Idealization by Local testing and DEconvolution (HILDE) filter (Pein et al., 2020). This non-parametric (model-free) segmentation method combines statistical multiresolution techniques with local deconvolution for idealising patch clamp (ion channel) recordings. It is able to idealise short events (flickering) and allows for heterogeneous noise, but is rather slow. Hence, we recommend using jsmurf or jules instead if they are suitable as well. Please see the arguments family and method as well as the examples for how to call the function correctly depending on whether homogeneous noise is assumed or heterogeneous noise is allowed. hilde is a combination of jsmurf (with locationCorrection == "none") and improveSmallScales. Further details about how to decide whether the noise is homogeneous or heterogeneous and whether events are short, and hence which method is suitable, are given in the accompanying vignette.

If q1 == NULL or q2 == NULL a Monte-Carlo simulation is required for computing the critical values. Since a Monte-Carlo simulation can take much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, this package saves the simulations by default in the workspace and on the file system such that a second call requiring the same Monte-Carlo simulation will be much faster. For more details, in particular to which arguments the Monte-Carlo simulations are specific, see Section Storing of Monte-Carlo simulations below. Progress of a Monte-Carlo simulation can be reported by the argument messages and the saving can be controlled by the argument options; both can be specified in ... and are explained in getCritVal.
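
The recommended combinations of family and method can be collected in a small helper. The following is only a sketch in base R: the function name chooseHildeArguments is hypothetical and not part of clampSeg, and the returned values are simply the recommendations given for the arguments family and method.

```r
# hypothetical helper, not part of clampSeg:
# returns the recommended family / method pair for the assumed noise type
chooseHildeArguments <- function(heterogeneous = FALSE) {
  if (heterogeneous) {
    # heterogeneous noise is allowed
    list(family = "hjsmurf", method = "2Param")
  } else {
    # homogeneous noise is assumed
    list(family = "jsmurfPS", method = "LR")
  }
}

hildeArgs <- chooseHildeArguments(heterogeneous = TRUE)
str(hildeArgs)

# the pair could then be passed on (requires clampSeg, data and a filter):
# hilde(data, filter = filter, family = hildeArgs$family,
#       method = hildeArgs$method)
```

Keeping the two arguments together avoids mixing a homogeneous family with a heterogeneous testing method by accident.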

Usage

hilde(data, filter, family = c("hjsmurf", "hjsmurfSPS", "hjsmurfLR",
                               "jsmurf", "jsmurfPS", "jsmurfLR"),
      method = c("2Param", "LR"), q1 = NULL, alpha1 = 0.01, q2 = NULL, alpha2 = 0.04, 
      sd = NULL, startTime = 0,
      output = c("onlyIdealization", "eachStep", "everything"), ...)

Arguments

data

a numeric vector containing the recorded data points

filter

an object of class lowpassFilter giving the used analogue lowpass filter

family

the parametric family used in the jsmurf step; "jsmurf", "jsmurfPS" and "jsmurfLR" assume homogeneous noise and "hjsmurf", "hjsmurfSPS" and "hjsmurfLR" allow for heterogeneous noise. By default, we recommend using "jsmurfPS" when homogeneous noise is assumed and "hjsmurf" when heterogeneous noise is allowed, see examples. "jsmurf" is the standard statistic from Hotz et al. (2013), "jsmurfPS" is a slightly more powerful partial sum statistic, and "jsmurfLR" is a likelihood ratio statistic, which is even more powerful but slow. "hjsmurf" is the standard statistic for heterogeneous noise, which estimates the variance locally, "hjsmurfSPS" is a studentized partial sum statistic, and "hjsmurfLR" is a likelihood ratio statistic, which is more powerful but very slow

method

the testing method for short events in the improveSmallScales step; "2Param" allows for heterogeneous noise, "LR" assumes homogeneous noise

q1

will be passed to the argument q in jsmurf; by default chosen automatically by getCritVal. For families "jsmurf", "jsmurfPS" and "jsmurfLR" a single numeric; for families "hjsmurf", "hjsmurfSPS" and "hjsmurfLR" a numeric vector giving scale-dependent critical values

alpha1

will be passed to the argument alpha in jsmurf; a probability, i.e. a single numeric between 0 and 1, giving the significance level to compute q1 (if q1 == NULL), see getCritVal. Its choice is a trade-off between data fit and parsimony of the estimator. In other words, this argument balances the risks of missing conductance changes and detecting additional artefacts

q2

will be passed to the argument q in improveSmallScales; a numeric vector of the same length as lengths giving the critical values for the tests for short events, by default chosen automatically by getCritVal

alpha2

will be passed to the argument alpha in improveSmallScales; a probability, i.e. a single numeric between 0 and 1, giving the significance level to compute the critical value (if q2 == NULL), see getCritVal. Its choice balances the risks of missing short events and detecting additional artefacts

sd

a single positive numeric giving the standard deviation (noise level) sigma0 of the data points before filtering, by default (NULL) estimated by sdrobnorm with lag = filter$len + 1L. For families "hjsmurf", "hjsmurfSPS" and "hjsmurfLR" this argument is ignored with a warning

startTime

a single numeric giving the time at which recording (sampling) of the data started; sampling time points will be assumed to be startTime + seq(along = data) / filter$sr

output

a string specifying the return type, see Value

...

additional parameters to be passed to getCritVal or improveSmallScales:

  1. getCritVal will be called automatically (if q1 == NULL or q2 == NULL), the number of data points n = length(data) will be set, the argument family will be assigned and alpha and filter will be passed. For these parameters no user interaction is required or possible; all other parameters of getCritVal can be passed additionally. Note that the same arguments will be passed twice if both q1 and q2 have to be computed. If this is not suitable, getCritVal can be called separately instead

  2. improveSmallScales will be called automatically, the fit computed by jsmurf will be passed to fit, and data, filter, method, q = q2, alpha = alpha2 and startTime will be passed; output will be set according to the output argument. For these parameters no user interaction is required or possible; all other parameters of improveSmallScales can be passed additionally
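
The sampling time points assumed for the argument startTime can be reproduced in base R. This is a minimal sketch with made-up data points; the list filterSketch is an assumption that only mimics the element sr of a lowpassFilter object and is not created by lowpassFilter.

```r
# stand-in for a lowpassFilter object, only the sampling rate sr is mimicked
filterSketch <- list(sr = 1e4)

# made-up data points, for illustration only
data <- c(0.1, 0.3, 0.2, 5.1, 4.9)
startTime <- 9

# sampling time points as described for the argument startTime
time <- startTime + seq(along = data) / filterSketch$sr
print(time)
```

Note that under this convention the first observation is placed at startTime + 1 / filter$sr, not at startTime itself.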

Value

The idealisation (estimation, regression) obtained by HILDE. The return type depends on the argument output.

If output == "onlyIdealization", an object of class stepblock containing the idealisation.

If output == "eachStep", a list containing the entries idealization with the idealisation, fit with the fit by jsmurf, q1 and q2 with the given / computed critical values, filter with the given filter and, for families "jsmurf", "jsmurfPS" and "jsmurfLR", sd with the given / estimated standard deviation.

If output == "everything", a list containing the entries idealization with a list containing the idealisation after each refining step in the local deconvolution, fit with the fit by jsmurf, q1 and q2 with the given / computed critical values, filter with the given filter and, for families "jsmurf", "jsmurfPS" and "jsmurfLR", sd with the given / estimated standard deviation.

Additionally, in all cases, the idealisation has an attribute "noDeconvolution", an integer vector that gives the segments for which no deconvolution could be performed because two short segments followed each other, see also details in improveSmallScales.
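
The examples abbreviate output as "each" and "every". The sketch below shows how such abbreviations resolve under R's partial matching. It assumes, which this page does not confirm, that hilde resolves output with match.arg; the vector outputChoices is copied from the Usage section.

```r
# the three documented values of the argument output
outputChoices <- c("onlyIdealization", "eachStep", "everything")

# unique prefixes are completed to the full value
resolvedEach <- match.arg("each", outputChoices)
resolvedEvery <- match.arg("every", outputChoices)

print(resolvedEach)
print(resolvedEvery)
```

Spelling out the full value in scripts is nevertheless the more robust choice, since an ambiguous prefix such as "e" would be rejected.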

Storing of Monte-Carlo simulations

If q1 == NULL or q2 == NULL a Monte-Carlo simulation is required to compute the critical values. Since a Monte-Carlo simulation can take much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, multiple possibilities for saving and loading the simulations are offered. Progress of a simulation can be reported by the argument messages, which can be specified in ... and is explained in the documentation of getCritVal.

Each Monte-Carlo simulation is specific to the parametric family / specified testing method, the number of observations and the used filter. Simulations related to computing q2 are also specific to the arguments thresholdLongSegment, localValue and localVar. Currently, storing such a Monte-Carlo simulation is only possible for their default values.

Note that Monte-Carlo simulations for a (slightly) larger number of observations nq, given in the argument nq in ... and explained in the documentation of getCritVal, can also be used. This avoids extensive resimulations when the number of observations varies only slightly, but results in a (small) loss of power. However, simulations of type "vectorIncreased" (only possible for q1 and families "jsmurf", "jsmurfPS" and "jsmurfLR") or "matrixIncreased", i.e. objects of classes "MCSimulationMaximum" and "MCSimulationVector" with nq observations, have to be resimulated if as.integer(log2(n1)) != as.integer(log2(n2)) when the saved simulation was computed with n == n1 and the simulation now is required for n == n2 and nq >= n1 and nq >= n2.

Simulations can either be saved in the workspace in the variable critValStepRTab or persistently on the file system, for which the package R.cache is used. Moreover, storing in and loading from variables and RDS files is supported. The simulation, saving and loading can be controlled by the argument options, which can be specified in ... and is explained in the documentation of getCritVal. By default simulations will be saved in the workspace and on the file system. For more details and for how simulations can be removed see Section Simulating, saving and loading of Monte-Carlo simulations in getCritVal.
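
The condition on as.integer(log2(n)) stated above can be checked in base R before starting a long computation. The helper below is a hypothetical sketch and not part of clampSeg or stepR; it only encodes the stated condition, while the decision which stored simulation is actually loaded remains with getCritVal.

```r
# hypothetical helper, not part of clampSeg or stepR:
# TRUE if n1 and n2 satisfy the condition stated above, i.e. both are at
# most nq and as.integer(log2(n1)) == as.integer(log2(n2))
satisfiesReuseCondition <- function(n1, n2, nq) {
  nq >= n1 && nq >= n2 && as.integer(log2(n1)) == as.integer(log2(n2))
}

# 29900 and 30000 observations: as.integer(log2(n)) is 14 for both
reuseClose <- satisfiesReuseCondition(2.99e4, 3e4, nq = 3e4)

# 10000 and 30000 observations: 13 versus 14
reuseFar <- satisfiesReuseCondition(1e4, 3e4, nq = 3e4)

print(c(reuseClose = reuseClose, reuseFar = reuseFar))
```

The numbers of observations are those used in the examples on this page (gramA[1:2.99e4], gramA[1:1e4] and nq = 3e4).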

References

Pein, F., Bartsch, A., Steinem, C., Munk, A. (2020) Heterogeneous Idealization of Ion Channel Recordings - Open Channel Noise. arXiv:2008.02658.

Hotz, T., Schütte, O., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013) Idealizing ion channel recordings by a jump segmentation multiresolution filter. IEEE Transactions on NanoBioscience 12(4), 376–386.

See Also

getCritVal, jsmurf, jules, lowpassFilter, improveSmallScales, createLocalList

Examples

## idealisation of the gramicidin A recordings given by gramA with hilde
# the used filter
filter <- lowpassFilter(type = "bessel", param = list(pole = 4L, cutoff = 1e3 / 1e4),
                        sr = 1e4)

# idealisation by HILDE assuming homogeneous noise
# this call requires a Monte-Carlo simulation
# and therefore might last a few minutes,
# progress of the Monte-Carlo simulation is reported
idealisation <- hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
                      startTime = 9, messages = 10)

# any second call should be much faster
# as the previous Monte-Carlo simulation will be loaded
hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR", startTime = 9)

# HILDE allowing heterogeneous noise
hilde(gramA, filter = filter, family = "hjsmurf", method = "2Param",
      startTime = 9, messages = 10, r = 100)
# r = 100 is used to reduce its run time,
# this is okay for illustration purposes, but for precise results
# a larger number of Monte-Carlo simulations is recommended

# much larger significance levels alpha1 and alpha2 for a larger detection power,
# alpha2 acts in the refinement step on small temporal scales,
# but also with the risk of detecting additional artefacts
hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
      alpha1 = 0.9, alpha2 = 0.9, startTime = 9)

# getCritVal was called in hilde, can be called explicitly
# for instance outside of a for loop to save run time
# alpha = 0.04 is passed to match the default alpha2 of hilde
q2 <- getCritVal(length(gramA), filter = filter, family = "LR", alpha = 0.04)
identical(hilde(gramA, filter = filter, family = "jsmurfPS",
                method = "LR", startTime = 9, q2 = q2), idealisation)

# both steps of HILDE can be called separately
fit <- jsmurf(gramA, filter = filter, family = "jsmurfPS", alpha = 0.01,
              startTime = 9, locationCorrection = "none")
deconvolution <- improveSmallScales(fit, data = gramA, method = "LR", filter = filter,
                                    startTime = 9, messages = 100)
attr(deconvolution, "q") <- NULL
identical(deconvolution, idealisation)

# more detailed output
each <- hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
              startTime = 9, output = "each")

every <- hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
               startTime = 9, output = "every")

identical(idealisation, each$idealization)
idealisationEvery <- every$idealization[[3]]
attr(idealisationEvery, "noDeconvolution") <- attr(every$idealization,
                                                   "noDeconvolution")
identical(idealisation, idealisationEvery)

identical(each$fit, fit)
identical(every$fit, fit)


## zoom into a single event
## similar to (Pein et al., 2018, Figure 2 lower left panel)
# sampling time points, recording started at startTime = 9
time <- 9 + seq(along = gramA) / filter$sr
plot(time, gramA, pch = 16, col = "grey30", ylim = c(20, 50),
     xlim = c(10.40835, 10.4103), ylab = "Conductance in pS", xlab = "Time in s")

# idealisation
lines(idealisation, col = "red", lwd = 3)

# idealisation convolved with the filter
ind <- seq(10.408, 10.411, 1e-6)
convolvedSignal <- lowpassFilter::getConvolution(ind, idealisation, filter)
lines(ind, convolvedSignal, col = "blue", lwd = 3)

# for comparison, fit prior to the improvement step
# does not contain the event and hence fits the recorded data points badly
# fit
lines(fit, col = "orange", lwd = 3)

# fit convolved with the filter
ind <- seq(10.408, 10.411, 1e-6)
convolvedSignal <- lowpassFilter::getConvolution(ind, fit, filter)
lines(ind, convolvedSignal, col = "darkgreen", lwd = 3)


## zoom into a single jump
plot(9 + seq(along = gramA) / filter$sr, gramA, pch = 16, col = "grey30",
     ylim = c(20, 50), xlim = c(9.6476, 9.6496), ylab = "Conductance in pS",
     xlab = "Time in s")

# idealisation
lines(idealisation, col = "red", lwd = 3)

# idealisation convolved with the filter
ind <- seq(9.647, 9.65, 1e-6)
convolvedSignal <- lowpassFilter::getConvolution(ind, idealisation, filter)
lines(ind, convolvedSignal, col = "blue", lwd = 3)

# idealisation with a wrong filter
# does not fit the recorded data points appropriately
wrongFilter <- lowpassFilter(type = "bessel",
                             param = list(pole = 6L, cutoff = 0.2),
                             sr = 1e4)
# the needed Monte-Carlo simulation depends on the number of observations and the filter
# hence a new simulation is required (if called for the first time)
idealisationWrong <- hilde(gramA, filter = wrongFilter, family = "jsmurfPS",
                           method = "LR", startTime = 9, messages = 10)

# idealisation
lines(idealisationWrong, col = "orange", lwd = 3)

# idealisation convolved with the wrong filter that was used to compute it
ind <- seq(9.647, 9.65, 1e-6)
convolvedSignal <- lowpassFilter::getConvolution(ind, idealisationWrong, wrongFilter)
lines(ind, convolvedSignal, col = "darkgreen", lwd = 3)


# simulation for a larger number of observations can be used (nq = 3e4)
# does not require a new simulation as the simulation from above will be used
# (if the previous call was executed first)
hilde(gramA[1:2.99e4], filter = filter, family = "jsmurfPS", method = "LR",
      startTime = 9, nq = 3e4)
# note that arguments to compute critical values are used to compute q1 and q2
# if this is not wanted, getCritVal can be called separately
q1 <- getCritVal(length(gramA[1:2.99e4]), filter = filter, family = "jsmurfPS",
                 messages = 100, r = 1e3)
hilde(gramA[1:2.99e4], filter = filter, family = "jsmurfPS", method = "LR",
      q1 = q1, startTime = 9, nq = 3e4) # nq = 3e4 is only used to compute q2

# simulation of type "vectorIncreased" for n1 observations can only be reused 
# for n2 observations if as.integer(log2(n1)) == as.integer(log2(n2))
# no simulation is required, since a simulation of type "matrixIncreased"
# will be loaded from the fileSystem
# this call also saves a simulation of type "vectorIncreased" in the workspace  
hilde(gramA[1:1e4], filter = filter, family = "jsmurfPS", method = "LR",
      startTime = 9, nq = 3e4)

# the above calls saved and (attempted to) load Monte-Carlo simulations
# in the following call the simulations will neither be saved nor loaded
# Monte-Carlo simulations are required for q1 and for q2
hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
      startTime = 9, messages = 10, r = 100,
      options = list(load = list(), save = list()))

# with given standard deviation
sd <- stepR::sdrobnorm(gramA, lag = filter$len + 1)
identical(hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
                startTime = 9, sd = sd), idealisation)

# with less regularisation of the correlation matrix
hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
      startTime = 9, regularization = 0.5)

# with estimation of the level of long segments by the mean
# but requiring 30 observations for it
hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
      startTime = 9, localValue = mean, thresholdLongSegment = 30)

# with one refinement step less, but with a larger grid
# progress of the deconvolution is reported
# potential warning for no deconvolution is suppressed
hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
      startTime = 9, messages = 100,
      lengths = c(3:5, 8, 11, 16, 20),
      gridSize = c(1 / filter$sr, 1 / 10 / filter$sr),
      windowFactorRefinement = 2, report = TRUE,
      suppressWarningNoDeconvolution = TRUE)

# pre-computation of certain quantities using createLocalList
# this saves run time if hilde (or improveSmallScales) is called more than once
# localList is passed via ... to improveSmallScales
localList <- createLocalList(filter = filter, method = "LR")
identical(hilde(gramA, filter = filter, family = "jsmurfPS", method = "LR",
                startTime = 9, localList = localList), idealisation)

clampSeg documentation built on Aug. 25, 2020, 5:07 p.m.