This vignette examines different failure modes we have found in the data from PurpleAir (PA-II) sensors.
knitr::opts_chunk$set(fig.width=7, fig.height=5)
library(AirSensor)
setArchiveBaseUrl("http://data.mazamascience.com/PurpleAir/v1")
initializeMazamaSpatialUtils()
pas <- example_pas
The A channel readings are moderately noisier than those of the B channel.
pat_a_noisy_1 <- example_pat_failure_A
pat_internalFit(pat_a_noisy_1)
Channel A often reads higher than channel B at the same moment in time. The scatter plot shows this as a vertical clump near the origin rather than points spread along the angled regression line.
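The same comparison can be approximated in base R as a rough numerical check. The sketch below is not the implementation of `pat_internalFit()`. It fits a plain linear model to synthetic A and B readings, with a made-up clump of points where B is near zero and A is elevated. All data and variable names are assumptions for illustration only.

```r
# Synthetic stand-in for the A and B channels (NOT real PurpleAir data).
set.seed(10)
n <- 400
pm25_B <- runif(n, 0, 30)
pm25_A <- pm25_B + rnorm(n, sd = 1)      # mostly well-matched channels

# Assumed failure pattern: B near zero while A reads anywhere from 0 to 15.
clump <- sample(n, 120)
pm25_B[clump] <- runif(120, 0, 1)
pm25_A[clump] <- runif(120, 0, 15)

# Linear fit of A against B, plus two simple summary numbers.
fit <- lm(pm25_A ~ pm25_B)
slope <- unname(coef(fit)["pm25_B"])
r_squared <- summary(fit)$r.squared
frac_A_above_B <- mean(pm25_A > pm25_B)  # share of readings where A exceeds B

print(c(slope = slope, r_squared = r_squared, frac_A_above_B = frac_A_above_B))
```

A share of readings with A above B that sits well over one half, combined with a reduced R-squared, is one possible numerical hint of this failure mode.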
Channel A readings show extreme levels of noise.
older_pas <- pas_load(20191001)
noisy_2_id <- pas_getDeviceDeploymentIDs(older_pas, pattern = "SCAP_14")
pat_a_noisy_2 <- pat_createNew(
  id = noisy_2_id,
  label = "SCAP_14",
  pas = older_pas,
  startdate = 20190701,
  enddate = 20190708,
  timezone = "America/Los_Angeles"
)
pat_multiPlot(pat_a_noisy_2, plottype = "pm25_over")
While a small amount of noise is natural when measuring particulate matter, sometimes the noise level goes far beyond what is allowable. Here we see the A channel looks like a cloud of points compared to the much more consistent channel B. A faint wave pattern can still be identified, however.
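One way to put a number on "a cloud of points" is to compare how far each channel scatters around its own smoothed trend. The following is a minimal base R sketch on synthetic data. The helper `noise_level()` and the window size are hypothetical choices, not part of AirSensor.

```r
# Synthetic stand-in for pat$data (NOT real sensor data): both channels follow
# the same wave, but channel A carries far more random noise than channel B.
set.seed(42)
n <- 500
wave <- 10 + 5 * sin(seq(0, 8 * pi, length.out = n))
pm25_A <- wave + rnorm(n, sd = 4)
pm25_B <- wave + rnorm(n, sd = 0.5)

# Hypothetical helper: robust spread of the residuals around a running median.
noise_level <- function(x, k = 11) {
  smooth <- stats::runmed(x, k = k)  # robust estimate of the underlying wave
  stats::mad(x - smooth)             # robust spread of what is left over
}

noise_A <- noise_level(pm25_A)
noise_B <- noise_level(pm25_B)
noise_ratio <- noise_A / noise_B

print(c(noise_A = noise_A, noise_B = noise_B, noise_ratio = noise_ratio))
```

A ratio far above 1 indicates that one channel is much noisier than the other. Where to draw the line for "not allowable" is a judgment call that this sketch does not settle.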
Channel A shows a sudden but short-lived jump in PM2.5 readings.
pat_a_jump <- example_pat_failure_B
pat_multiPlot(pat_a_jump, plottype = "pm25_over")
This plot shows a jump that seems to retain a consistent wave pattern for a while rather than just being random noise. This could possibly be a temporary mix of the "Matches Humidity" failure mode.
The A channel PM2.5 sensor starts reflecting humidity readings instead of PM levels.
humidity_id <- pas_getDeviceDeploymentIDs(pas, pattern = "BikeSGV - West Pasadena")
pat_a_humidity <- pat_createNew(
  id = humidity_id,
  label = "BikeSGV - West Pasadena",
  pas,
  startdate = "2019-04-16",
  enddate = "2019-04-24",
  timezone = "America/Los_Angeles"
)
pat_multiPlot(pat_a_humidity)
You can see in the multiplot the clear disconnect between the two PM2.5 channels. Both sensors appear to agree with each other until channel A suddenly jumps into the thousands and starts tracing the trend of the humidity data (plotted directly below).
pat_scatterPlotMatrix(pat_a_humidity)
A glance at these scatter plots gives us another look at just how uncorrelated the A and B channels are, while the relationships of channel A with temperature and humidity are abnormally well-defined. It is actually not very clear which of the auxiliary sensors the A channel is reflecting, since temperature and relative humidity are naturally correlated with each other.
A more in-depth analysis of this issue is provided in local_examples/bikesgv_story.Rmd.
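The pattern in the scatter plot matrix can also be read off a plain correlation matrix. The sketch below uses synthetic data in which channel A tracks humidity, channel B is independent, and temperature falls as humidity rises. The column names mirror those used above, but the values, and the sign of the humidity-temperature relationship, are assumptions.

```r
# Synthetic stand-in for pat$data (NOT real sensor data).
set.seed(1)
n <- 300
humidity <- 50 + 20 * sin(seq(0, 6 * pi, length.out = n)) + rnorm(n, sd = 2)
temperature <- 90 - 0.5 * humidity + rnorm(n, sd = 1)  # assumed inverse relation
pm25_B <- 8 + rnorm(n, sd = 2)                         # plausible PM2.5 levels
pm25_A <- 3000 + 5 * humidity + rnorm(n, sd = 10)      # A "reflects" humidity

df <- data.frame(pm25_A, pm25_B, humidity, temperature)

# Pairwise Pearson correlations, rounded for readability.
cors <- round(cor(df, use = "complete.obs"), 2)
print(cors)
```

Because humidity and temperature are themselves strongly correlated in this toy data, channel A correlates strongly with both, which illustrates why the correlations alone cannot show which auxiliary sensor it is reflecting.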
The A channel is centered around a particular level but is sometimes affected by humidity when it goes past a certain threshold.
id <- pas_getDeviceDeploymentIDs(pas, pattern = "SCEM_05")
pat <- pat_createNew(
  id = id,
  label = "SCEM_05",
  pas,
  startdate = 20190701,
  enddate = 20190710,
  timezone = "America/Los_Angeles"
)
pat_multiPlot(pat)
The multiplot shows the A and B channels have very different readings, similar to the "Matches Humidity" failure mode. The strange thing here though is that the A channel is mostly flat, with only the occasional spike when the humidity measures very high. Let's see what number it is centered around:
plot(
  pat$data$datetime,
  pat$data$pm25_A,
  ylim = c(3325, 3340),
  pch = 15,
  cex = 0.6,
  col = adjustcolor("black", 0.2),
  xlab = "2019",
  ylab = "PM2.5 A"
)
temp <- table(as.vector(pat$data$pm25_A))
print(paste0("Mode value: ", names(temp)[temp == max(temp)]))
Although there is plenty of noise between 3325 and 3340 µg/m³, a very clear, very straight line of points is visible at exactly 3333.0 µg/m³.
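Beyond finding the mode, it can help to ask what share of all readings sit exactly on it. Here is a small base R sketch on synthetic data. The 40% share of stuck readings is an invented figure for illustration, not something measured from SCEM_05.

```r
# Synthetic stand-in for pat$data$pm25_A (NOT real sensor data):
# noisy readings around 3333, with a block of values stuck at exactly 3333.0.
set.seed(7)
n <- 1000
pm25_A <- round(3333 + rnorm(n, sd = 3), 1)
stuck <- sample(n, 400)          # assumed share of stuck readings
pm25_A[stuck] <- 3333.0

counts <- table(pm25_A)
mode_value <- as.numeric(names(counts)[which.max(counts)])
mode_fraction <- max(counts) / n # share of readings equal to the mode

print(paste0("Mode value: ", mode_value, ", share of readings: ", mode_fraction))
```

A healthy channel reporting to one or two decimal places would rarely place a large share of its readings on a single value, so a high share is one possible flag for a stuck sensor.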
Channel B measures no particulate matter at all.
zero_id <- pas_getDeviceDeploymentIDs(pas, pattern = "SCAP_46")
pat_b_zero <- pat_createNew(
  id = zero_id,
  label = "SCAP_46",
  pas,
  startdate = "2019-07-01",
  enddate = "2019-07-08",
  timezone = "America/Los_Angeles"
)
pat_multiPlot(pat_b_zero, sampleSize = NULL)
simple <- dplyr::select(pat_b_zero$data, datetime, pm25_A, pm25_B)
head(simple)
In this case, one may wish to work with the A channel data only.
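Falling back to a single channel could look like the base R sketch below. It builds a synthetic data frame with the same three columns as `simple`, checks what fraction of B readings are exactly zero, and uses A alone when B looks dead. The 0.95 cutoff, the two-minute sampling interval, and the variable names are assumptions for illustration.

```r
# Synthetic stand-in for `simple` (NOT real sensor data): B reports only zeros.
set.seed(3)
n <- 200
simple <- data.frame(
  datetime = as.POSIXct("2019-07-01", tz = "America/Los_Angeles") + 120 * seq_len(n),
  pm25_A = round(pmax(0, 12 + rnorm(n, sd = 3)), 2),
  pm25_B = rep(0, n)
)

# Fraction of B readings that are exactly zero.
zero_fraction_B <- mean(simple$pm25_B == 0, na.rm = TRUE)
b_is_dead <- zero_fraction_B > 0.95  # assumed cutoff, tune as needed

# Use A alone when B looks dead; otherwise average the two channels.
pm25 <- if (b_is_dead) {
  simple$pm25_A
} else {
  rowMeans(simple[, c("pm25_A", "pm25_B")], na.rm = TRUE)
}

print(head(data.frame(datetime = simple$datetime, pm25 = pm25)))
```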