| Dimodal | R Documentation |
Dimodal studies the modality of data using its spacing. Peaks, or local increases, in the spacing indicate that the data is multi-modal and locate the anti-modes. Flats, or stretches of consistent spacing, cover the modes. Dimodal finds these features either in the spacing after smoothing by low-pass filtering, which supports discrete or heavily quantized data, or in the interval spacing. Several tests, using parametric models, runs, and bootstrap sampling, evaluate these features.
Dimodal(x, opt=Diopt())
## S3 method for class 'Dimodal'
print(x, feature=c('peaks', 'flats'), ...)
## S3 method for class 'Dimodal'
summary(object, feature=c('peaks', 'flats'), ...)
## S3 method for class 'Dimodal'
plot(x, show=c('lp', 'histogram', 'diw'),
feature=c('peaks', 'flats'), opt=Diopt(), ...)
x |
for |
object |
an object of class |
opt |
local version of options to guide analysis |
feature |
display only the indicated feature(s) in all methods that were run or, for plots, mark only those features in the graph |
show |
plot the low-pass spacing, a histogram of the raw data, and/or the interval spacing, in separate graphs in the order given |
... |
extra arguments, ignored for all methods |
Changes in the spacing of data can indicate a change in its modality, and
Dimodal is a general interface to feature detectors and tests to
evaluate such changes. Spacing, the difference between consecutive order
statistics or the delta after sorting the data, takes on a ‘U’ form,
increasing rapidly in the tails and remaining stable in the center (for
single-sided variates it forms half the U; uniform variates have constant
spacing). The transition between modes is marked by local increases in the
spacing, while the centers of modes see stable values. Dimodal therefore
looks for local maxima or peaks in the spacing, or locally flat regions.
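As a minimal sketch of the definition above, the raw spacing can be computed in base R by sorting and differencing. This uses only sort and diff and is not Dimodal's internal code. The variable names are illustrative.

```r
## Raw spacing Di: difference between consecutive order statistics.
x  <- faithful$waiting          # example data shipped with R
xs <- sort(x)                   # order statistics
di <- diff(xs)                  # spacing, one value fewer than the data

## Integer-valued data gives a heavily quantized spacing with many ties.
table(di)
```

The spacings are non-negative and sum to the range of the data, which is a quick sanity check on any implementation.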
The spacing, designated Di, is often very noisy, and may be quantized
to a few values if the data is discrete or taken with limited precision.
Smoothing is necessary, which Dimodal can do either by applying a low-pass
(lp) filter or by taking the difference over more than one order
statistic. The latter is called the interval spacing Diw and is
generated as a difference with lag; it is equivalent to a running mean or
rectangular filter of the raw spacing. The recommended low-pass filter is
a Kaiser kernel, which offers good high-frequency suppression and main lobe
width; other available filters are the Bartlett or triangular (synonyms),
Hanning, Hamming, Gaussian or normal (synonyms), and Blackman. Filtering
is done by convolving the data with the filter's kernel, rather than moving
to the Fourier domain. Points at the start and finish that are partially
covered by the kernel or interval are set to NA and attributes attached to
the data give the valid range. The two spacings are indexed differently.
The low-pass kernel is centered, with partial overlaps at both ends. The
interval spacing is defined as trailing from the upper index, which runs
to the end of the data, so the partial overlap occurs only at the start.
This will be seen in the position of the smoothed curves when plotting
results, and the shift in indices needed to align the two schemes will be
printed with the data summary. The raw values corresponding to a feature
automatically compensate for the difference.
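The two smoothing schemes and their indexing can be sketched in base R. This is an illustration under stated assumptions, not Dimodal's implementation: the data are synthetic, the width w and kernel length k are arbitrary, a triangular kernel stands in for the recommended Kaiser kernel, and stats::filter does the convolution. The equivalence with the running mean holds up to the factor w. Whether Dimodal scales the interval spacing is not stated here.

```r
set.seed(1)                      # arbitrary seed for a reproducible sketch
xs <- sort(rnorm(200))           # order statistics of synthetic data
n  <- length(xs)
di <- diff(xs)                   # raw spacing, length n - 1

## Interval spacing: difference with lag w, xs[j + w] - xs[j].
w   <- 10                        # illustrative interval width
diw <- diff(xs, lag = w)         # length n - w

## Trailing running mean of the raw spacing (sides = 1 looks backward only).
rmean <- as.numeric(stats::filter(di, rep(1 / w, w), sides = 1))

## The first w - 1 points are only partially covered, so they are NA.
which(is.na(rmean))              # at the start only

## Past that, the running mean times w reproduces the interval spacing.
all.equal(rmean[w:(n - 1)] * w, diw)

## Centered low-pass filter by direct convolution, here a triangular kernel.
k    <- 11                       # odd kernel length, illustrative
kern <- 1 - abs(seq(-1, 1, length.out = k + 2)[-c(1, k + 2)])
kern <- kern / sum(kern)         # normalize to unit gain
lp   <- as.numeric(stats::filter(di, kern, sides = 2))

## A centered kernel leaves (k - 1) / 2 NA values at each end.
range(which(!is.na(lp)))
```

The trailing filter has NA values only at the start while the centered one loses points at both ends, which is the index offset between the two schemes described above.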
The feature detectors find.peaks and find.flats have
separate help pages describing their algorithms and the parameters that
control their analysis. These features are local and therefore not only
indicate whether data may be multi-modal, but provide the location of the
modes and the transitions between them.
Dimodal uses three main strategies to evaluate the features. First, the
model tests are Dipeak.test and Diflat.test, with critical
values at a given significance level also available. These models are based on
simulations of the peak heights and flat lengths in a univariate null
distribution and offer a parametric assessment of their significance. They
are less conservative than other modality detectors. Second, the
bootstrap test is Diexcurht.test. The bootstrap simulates the
features by drawing from a pool of the differences of the spacing, estimating
their probability without assuming any underlying distribution. Finally,
the runs tests are Dinrun.test, Dirunlen.test, and
Dipermht.test. Quantizing the filtered spacing into a few levels
by taking the sign of the difference (in other words, if the signal is
increasing, decreasing, or constant) allows us to consider runs in the
symbols. We can test how many runs there are, how long the longest is, or
whether a permutation of them recreates the feature.
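The quantization step behind the runs tests can be sketched with base R's sign, diff, and rle. The vector y below is a made-up stand-in for a smoothed spacing. The statistics shown are only the raw counts, not the test probabilities that Dinrun.test or Dirunlen.test would compute.

```r
## Hypothetical smoothed spacing values, for illustration only.
y <- c(1, 2, 3, 3, 2, 1, 2)

## Quantize to three symbols: +1 increasing, -1 decreasing, 0 constant.
s <- sign(diff(y))

## Run-length encode the symbols.
r <- rle(s)
nrun   <- length(r$lengths)     # number of runs
maxrun <- max(r$lengths)        # length of the longest run
```

Here the symbols are 1, 1, 0, -1, -1, 1, giving four runs with the longest of length two.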
A fourth strategy, using changepoint detectors on the raw spacing to detect transitions between modes and anti-modes, is not included in this version of Dimodal. See the package help page or DESCRIPTION file for the location of the full version.
The bootstrap test extends a peak to its support, defined by the
"peak.fhsupp" option, a fraction of the peak's height. A value of 0.9
is enough to back the support away from minima placed in a long flat while not
distorting the peak's width if the minima are well-defined. 0.5
corresponds to Full Width at Half Maximum (FWHM), and 1.0 extends the
peak to the minima.
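One way to picture the support fraction is the sketch below. The helper peak_support is hypothetical and is not part of Dimodal. It assumes the height on each side is measured from the peak down to that side's minimum, and that the support is the contiguous stretch around the peak that stays within the given fraction of that height. The package's actual rule may differ.

```r
## Hypothetical helper: indices bounding the support of a peak at ipk,
## given the flanking minima at ileft and iright and a height fraction.
peak_support <- function(y, ipk, ileft, iright, frac) {
  stopifnot(ileft <= ipk, ipk <= iright, frac >= 0, frac <= 1)
  ## Walk left while the curve stays above the left-side threshold.
  thr.l <- y[ipk] - frac * (y[ipk] - y[ileft])
  lo <- ipk
  while (lo > ileft && y[lo - 1] >= thr.l) lo <- lo - 1
  ## Walk right while the curve stays above the right-side threshold.
  thr.r <- y[ipk] - frac * (y[ipk] - y[iright])
  hi <- ipk
  while (hi < iright && y[hi + 1] >= thr.r) hi <- hi + 1
  c(lo, hi)
}

## Made-up peak whose minima sit at the far ends of long flats.
y <- c(0, 0, 0, 1, 4, 8, 4, 1, 0, 0)
s10 <- peak_support(y, ipk = 6, ileft = 1, iright = 10, frac = 1.0)
s09 <- peak_support(y, ipk = 6, ileft = 1, iright = 10, frac = 0.9)
s05 <- peak_support(y, ipk = 6, ileft = 1, iright = 10, frac = 0.5)
```

With these values a fraction of 1.0 spans the whole flat out to the minima (indices 1 to 10), 0.9 pulls the support in to the foot of the peak (4 to 8), and 0.5 gives the half-maximum width (5 to 7).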
The analysis of each feature is gathered into separate S3 class objects
which support printing and marking plots.
The generic functions on the Dimodal
result route to these objects if they are selected by the feature
argument. A plot may contain the filtered spacing or interval spacing plus
a histogram of the raw data, with features annotated on each. It uses layout
to create a row of the shown graphs, as specified by the show
argument. The histogram annotations will come from the first, leftmost,
spacing shown.
The raw data must be numeric or integer. Non-finite values, including NA, will be dropped.
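The effect of dropping non-finite values can be reproduced in base R with is.finite. This mirrors the behavior described above and is not necessarily how Dimodal does it internally. The values are made up.

```r
## is.finite is FALSE for NA, NaN, Inf, and -Inf.
x <- c(3.1, NA, 2.4, Inf, NaN, 5, -Inf)   # made-up values
x.clean <- x[is.finite(x)]                # keeps 3.1, 2.4, 5
```

Checking length(x.clean) against length(x) before the analysis shows how many points were lost.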
Dimodal needs a complete list of options for the opt argument. Do
not make changes to options in the call itself, as Diopt will then return
only the changed values rather than the full list. Use Diopt.local instead.
The option "analysis" controls which smoothed spacing to generate,
one or both of 'lp' and 'diw'. If neither is specified
the data will contain only the spacing and mid-quantile function,
without any features or their analysis.
Dimodal uses options "lp.param" and "diw.param" to override
the detector options for each method, and "lp.tests" and
"diw.tests" to determine which feature tests to carry out. If
these are empty lists then the data will contain the smoothed spacing but
there will be no features. While generating the data it uses options
"lp.kernel" and "lp.window" to set up the low-pass filter,
and "diw.window" for the interval width. It uses "excur.ntop"
when creating the base set of draws for excursion tests. Option
"data.midq" determines the approximation method (type argument to
the midquantile function), when converting indices in the spacing back
to order statistics.
The default values of the detector options come from the development of the low-pass models. We do not know how different values will affect the models. The interval spacing is much rougher than the low-pass spacing, which may require looser ripple and height parameters to find any flat, or to reduce the number of peaks. The excursion tests will accommodate this.
A list assigned to class "Dimodal" with elements
data |
an object of class |
lp.peaks |
an object of class |
lp.flats |
an object of class |
diw.peaks |
an object of class |
diw.flats |
an object of class |
opt |
the list passed as the opt argument, per |
These elements will have empty data structures if the analysis is not run.
Dimodal will automatically call shiftID.place on each detector's
results and will summarize the tests, as described with each data class.
Dimodal adds an attribute "source" to each of the features, with
value LP, Diw, or Di.
Diopt for the parameters controlling the analysis.
find.peaks, find.flats for feature detection.
Dipeak.test, Diflat.test for parametric models to
evaluate the features, Diexcurht.test for a bootstrap test of
feature significance, Dinrun.test, Dirunlen.test for
tests of runs (here for sequences in the sign of the difference in the
interval spacing), and Dipermht.test for a permutation test
of the runs making a feature.
Didata, Dipeak, Diflat for the
data structures generated by the feature detectors and their evaluation.
center.diw to further shift the position of interval spacing
features to the middle of the interval to align with low-pass features.
match.features to identify common peaks and flats in both
spacings.
shiftID.place to move indices in either spacing to the
original data grid and add the corresponding raw values.
midquantile for the mid-quantile mapping from index to raw
data.
## The interval spacing is noisy with the default options, so require a
## larger peak height with a temporary value to Diopt.
oldopt <- Diopt(diw.param=list(peak.fht=0.125))
## Run the analysis.
m <- Dimodal(faithful$waiting)
## If printing the results, the interval spacing peaks have a probability
## just under 0.05 but fail the acceptance levels.
summary(m)
## Details about the peaks in both spacings.
print(m, feature="peaks")
## We find one peak in both spacings, but only the low-pass is significant.
match.features(m)
## Three plots side by side. The limited resolution of the data is clear
## in the interval spacing.
dev.new(width=12, height=4) ; plot(m)
## Restore the old option values. Diopt(NULL) returns to defaults.
oldopt <- Diopt(oldopt)