Outlier detection using a Median Absolute Deviation (MAD) "Hampel" filter. This function applies a rolling Hampel filter to flag points that lie far out in the tails of the distribution of values within the window.
The thresholdMin level is similar to a sigma value for normally distributed data. The default threshold setting thresholdMin = 8 identifies points that are extremely unlikely to be part of a normal distribution and are therefore very likely to be outliers. Choosing a relatively large value for thresholdMin makes false positives less likely.
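The idea of a sigma-like threshold can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation in seismicRoll: the helper name hampel_scores is hypothetical, the window is centered on each point, and the MAD is scaled with the default constant of stats::mad() so that it approximates a standard deviation for normally distributed data.

```r
# Hypothetical helper (not part of AirSensor or seismicRoll):
# for each point with a full centered window, return
# |value - window median| / scaled window MAD.
hampel_scores <- function(x, window_size = 15) {
  stopifnot(window_size %% 2 == 1, length(x) >= window_size)
  half <- (window_size - 1) / 2
  scores <- rep(NA_real_, length(x))   # edges without a full window stay NA
  for (i in (half + 1):(length(x) - half)) {
    w <- x[(i - half):(i + half)]
    med <- median(w)
    # stats::mad() multiplies by 1.4826 by default, which makes the
    # result comparable to sigma for normally distributed data
    scale <- mad(w, center = med)
    dev <- abs(x[i] - med)
    scores[i] <- if (scale > 0) dev / scale else if (dev == 0) 0 else Inf
  }
  scores
}

set.seed(42)
x <- rnorm(200, mean = 10, sd = 1)   # synthetic, well-behaved signal
x[100] <- 60                         # inject one obvious spike
scores <- hampel_scores(x, window_size = 15)
outlier_idx <- which(scores > 8)     # plays the role of thresholdMin = 8
```

Because the median and MAD are robust statistics, the injected spike barely affects the window statistics it is compared against, so its score lands far above the threshold.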
The default window size, windowSize = 15, means that 15 samples from a single channel are used to determine the distribution of values for which a median is calculated. Each PurpleAir channel makes a measurement approximately every 120 seconds, so the temporal window is 15 * 120 sec, or approximately 30 minutes. This seems like a reasonable period of time over which to evaluate PM2.5 measurements.
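The window arithmetic above can be checked directly. The variable names below are illustrative only, and the 120-second interval is the approximate figure quoted in the text, not a value read from any data.

```r
window_size <- 15            # samples per window (the windowSize default)
sample_interval_sec <- 120   # approximate reporting interval per channel
window_minutes <- window_size * sample_interval_sec / 60
window_minutes               # 30
```

The same expression can be inverted to pick a windowSize for a different target duration, keeping in mind that the real sampling interval is only approximately constant.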
Specifying replace = TRUE allows you to perform smoothing by replacing outliers with the window median value. Using this technique, you can create a highly smoothed, artificial dataset by setting thresholdMin = 1 or lower (but always above zero).
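The effect of median replacement at different thresholds can be sketched in base R. This is an illustration under assumptions rather than the package's code: hampel_replace is a hypothetical helper, windows are centered, window statistics are always computed from the original values, and edge points without a full window are left unchanged.

```r
# Hypothetical helper (not part of AirSensor or seismicRoll):
# replace any point whose Hampel score exceeds `threshold`
# with the median of its centered window.
hampel_replace <- function(x, window_size = 15, threshold = 8) {
  stopifnot(window_size %% 2 == 1, length(x) >= window_size, threshold > 0)
  half <- (window_size - 1) / 2
  out <- x
  for (i in (half + 1):(length(x) - half)) {
    w <- x[(i - half):(i + half)]    # window taken from the original series
    med <- median(w)
    scale <- mad(w, center = med)    # MAD scaled to approximate sigma
    if (scale > 0 && abs(x[i] - med) / scale > threshold) {
      out[i] <- med
    }
  }
  out
}

set.seed(1)
x <- rnorm(200, mean = 10, sd = 1)
x[100] <- 60                                  # one obvious spike

cleaned  <- hampel_replace(x, threshold = 8)  # touches only extreme points
smoothed <- hampel_replace(x, threshold = 1)  # replaces many ordinary points

n_replaced_cleaned  <- sum(cleaned != x)
n_replaced_smoothed <- sum(smoothed != x)
```

With the high threshold the spike is replaced and almost everything else is left alone; with a threshold of 1 a large share of ordinary points is pulled to the local median, which is why the result should be treated as an artificial, smoothed series.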
pat_outliers(
  pat = NULL,
  windowSize = 15,
  thresholdMin = 8,
  replace = FALSE,
  showPlot = TRUE,
  data_shape = 18,
  data_size = 1,
  data_color = "black",
  data_alpha = 0.5,
  outlier_shape = 8,
  outlier_size = 1,
  outlier_color = "red",
  outlier_alpha = 1
)
pat | PurpleAir Timeseries pat object.
windowSize | Integer window size for outlier detection.
thresholdMin | Threshold value for outlier detection.
replace | Logical specifying whether to replace outliers with the window median value.
showPlot | Logical specifying whether to generate outlier detection plots.
data_shape | Symbol to use for data points.
data_size | Size of data points.
data_color | Color of data points.
data_alpha | Opacity of data points.
outlier_shape | Symbol to use for outlier points.
outlier_size | Size of outlier points.
outlier_color | Color of outlier points.
outlier_alpha | Opacity of outlier points.
A pat object with outliers replaced by median values.
Additional documentation on the algorithm is available in seismicRoll::findOutliers().
library(AirSensor)
example_pat %>%
  pat_filterDate(20180801, 20180815) %>%
  pat_outliers(replace = TRUE, showPlot = TRUE)