PurpleAir Failure Modes

This vignette examines different failure modes we have found in the data from PurpleAir (PA-II) sensors.

knitr::opts_chunk$set(fig.width=7, fig.height=5)
library(AirSensor)
setArchiveBaseUrl("http://data.mazamascience.com/PurpleAir/v1")
initializeMazamaSpatialUtils()
pas <- example_pas

Channel A Moderate Noise

The A channel readings are moderately noisier than those of the B channel.

pat_a_noisy_1 <- example_pat_failure_A
pat_internalFit(pat_a_noisy_1)

Channel A often reads higher than channel B at the same moment in time. In the scatter plot, this shows up as a vertical clump of points near the origin rather than points spread along the angled regression line.
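A rough way to put a number on this disagreement is to regress one channel on the other and look at the slope and R². The sketch below uses synthetic data in place of the sensor's `pm25_A` and `pm25_B` columns, so the values are purely illustrative; it is not necessarily how pat_internalFit() computes its fit.

```r
# Synthetic stand-in for pat$data: both channels follow the same signal,
# but channel A carries more noise. All numbers are made up.
set.seed(42)
truth <- 10 + 5 * sin(seq(0, 6 * pi, length.out = 500))
channels <- data.frame(
  pm25_A = truth + rnorm(500, sd = 3),
  pm25_B = truth + rnorm(500, sd = 0.5)
)

# Fit A as a linear function of B. Two healthy channels would give a
# slope near 1, an intercept near 0 and an R-squared near 1.
fit <- lm(pm25_A ~ pm25_B, data = channels)
fit_summary <- c(
  slope     = unname(coef(fit)["pm25_B"]),
  intercept = unname(coef(fit)["(Intercept)"]),
  r_squared = summary(fit)$r.squared
)
print(round(fit_summary, 3))
```

With real data, the same call could be run on the `data` element of a pat object. A slope close to 1 combined with a low R² would be consistent with the "moderate noise" pattern described above.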

Channel A Extreme Noise

Channel A readings show extreme levels of noise.

older_pas <- pas_load(20191001)
noisy_2_id <- pas_getDeviceDeploymentIDs(older_pas, pattern = "SCAP_14")
pat_a_noisy_2 <- pat_createNew(id = noisy_2_id,
                               label = "SCAP_14", 
                               pas = older_pas, 
                               startdate = 20190701, 
                               enddate = 20190708,
                               timezone = "America/Los_Angeles")
pat_multiPlot(pat_a_noisy_2, plottype = "pm25_over")

While a small amount of noise is natural when measuring particulate matter, sometimes the noise level goes far beyond what is acceptable. Here, channel A looks like a cloud of points next to the much more consistent channel B. A faint wave pattern can still be identified, however.
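One way to compare noise levels between channels without being misled by the underlying wave is to subtract a running median and measure the spread of what is left. The sketch below does this on synthetic data; the helper `noise_level()` and the window size are hypothetical choices for illustration, not part of AirSensor.

```r
# Synthetic channels sharing one wave pattern; A is far noisier than B.
set.seed(1)
n <- 600
wave <- 15 + 8 * sin(seq(0, 8 * pi, length.out = n))
pm25_A <- wave + rnorm(n, sd = 10)
pm25_B <- wave + rnorm(n, sd = 1)

# Hypothetical helper: robust spread of the residuals around a running
# median. `window` must be odd, as required by stats::runmed().
noise_level <- function(x, window = 11) {
  smooth <- stats::runmed(x, k = window)
  stats::mad(x - smooth)
}

noise_A <- noise_level(pm25_A)
noise_B <- noise_level(pm25_B)
print(c(noise_A = noise_A, noise_B = noise_B, ratio = noise_A / noise_B))
```

A ratio well above 1 flags channel A as the noisier one. Note that stats::runmed() does not accept missing values by default, so real data would need gaps handled first.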

Channel A Jumps

Channel A shows a sudden but short-lived jump in PM2.5 readings.

pat_a_jump <- example_pat_failure_B
pat_multiPlot(pat_a_jump, plottype = "pm25_over")

This plot shows a jump that retains a consistent wave pattern for a while rather than being random noise. This may be a temporary occurrence of the "Matches Humidity" failure mode.

Channel A Matches Humidity

The A channel PM2.5 sensor starts reflecting humidity readings instead of PM levels.

humidity_id <- pas_getDeviceDeploymentIDs(pas, 
                                          pattern = "BikeSGV - West Pasadena")
pat_a_humidity <- pat_createNew(id = humidity_id,
                                label = "BikeSGV - West Pasadena", 
                                pas = pas, 
                                startdate = "2019-04-16", 
                                enddate = "2019-04-24",
                                timezone = "America/Los_Angeles")
pat_multiPlot(pat_a_humidity)

You can see in the multiplot the clear disconnect between the two PM2.5 channels. Both sensors appear to agree with each other until channel A suddenly jumps into the thousands and starts tracing the trend of the humidity data (plotted directly below).

pat_scatterPlotMatrix(pat_a_humidity)

A glance at these scatter plots gives us another look at just how uncorrelated the A and B channels are, while the relationships between channel A and both temperature and humidity are abnormally well-defined. It is actually not clear which of the auxiliary sensors the A channel is reflecting, since temperature and relative humidity are naturally correlated with each other.
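The same picture can be summarized numerically with a correlation matrix. The sketch below builds synthetic data in which channel A follows humidity while channel B does not; the column names `humidity` and `temperature` are assumed here for illustration, and every value is invented.

```r
# Synthetic hourly data: humidity and temperature move in opposite
# directions over a daily cycle, channel A tracks humidity, and channel B
# is unrelated to either.
set.seed(7)
n <- 400
hours <- seq_len(n)
humidity    <- 50 + 20 * sin(2 * pi * hours / 24) + rnorm(n, sd = 2)
temperature <- 70 - 10 * sin(2 * pi * hours / 24) + rnorm(n, sd = 1)
readings <- data.frame(
  pm25_A = 40 * humidity + rnorm(n, sd = 50),
  pm25_B = 12 + rnorm(n, sd = 3),
  temperature = temperature,
  humidity = humidity
)

# Pairwise Pearson correlations between all four variables
cor_matrix <- cor(readings)
print(round(cor_matrix, 2))
```

Because humidity and temperature are strongly anti-correlated in this example, channel A correlates strongly with both, which mirrors the ambiguity noted above.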

A more in-depth analysis of this issue is provided in local_examples/bikesgv_story.Rmd.

Channel A Magic Number

The A channel is centered around a particular value but is sometimes affected by humidity when the humidity exceeds a certain threshold.

id <- pas_getDeviceDeploymentIDs(pas, pattern = "SCEM_05")
pat <- pat_createNew(id = id,
                     label = "SCEM_05",
                     pas = pas,
                     startdate = 20190701, 
                     enddate = 20190710,
                     timezone = "America/Los_Angeles")
pat_multiPlot(pat)

The multiplot shows the A and B channels have very different readings, similar to the "Matches Humidity" failure mode. The strange thing here though is that the A channel is mostly flat, with only the occasional spike when the humidity measures very high. Let's see what number it is centered around:

plot(pat$data$datetime, pat$data$pm25_A, 
     ylim = c(3325, 3340), 
     pch = 15, cex = 0.6, col=adjustcolor("black", 0.2),
     xlab = "2019", ylab = "PM2.5 A")

temp <- table(as.vector(pat$data$pm25_A))
print(paste0("Mode value: ", names(temp)[temp == max(temp)]))

Although there is plenty of noise between 3325 and 3340 µg/m³, a very clear, very straight line of points is visible at exactly 3333.0 µg/m³.
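To judge how dominant such a "magic number" is, it can help to report what share of all readings sit exactly on the modal value. The sketch below does this on made-up readings; it also uses which.max() so that a single mode is returned even if several values tie.

```r
# Made-up readings: 300 values stuck at exactly 3333.0 plus 200 noisy
# values scattered between 3325 and 3340.
set.seed(3)
pm25_A <- c(rep(3333.0, 300), round(runif(200, min = 3325, max = 3340), 2))

# Tabulate the distinct values, then take the most frequent one
counts <- table(pm25_A)
mode_value <- as.numeric(names(counts)[which.max(counts)])
mode_share <- max(counts) / length(pm25_A)

print(paste0("Mode value: ", mode_value))
print(paste0("Share of readings at the mode: ", round(100 * mode_share, 1), "%"))
```

A large share concentrated on one exact value is a strong hint that the sensor is reporting a fixed value rather than a measurement.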

Channel B is Zero

Channel B measures no particulate matter at all.

zero_id <- pas_getDeviceDeploymentIDs(pas, pattern = "SCAP_46")

pat_b_zero <- pat_createNew(id = zero_id, 
                            label = "SCAP_46", 
                            pas = pas,
                            startdate = "2019-07-01", 
                            enddate = "2019-07-08",
                            timezone = "America/Los_Angeles")
pat_multiPlot(pat_b_zero, sampleSize = NULL)

simple <- dplyr::select(pat_b_zero$data, datetime, pm25_A, pm25_B)
head(simple)

In this case, one may wish to work with the A channel data only.
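A minimal sketch of that fallback is shown below, assuming a data frame with `pm25_A` and `pm25_B` columns like the one selected above. The helper `is_flatlined()` is hypothetical and the readings are invented for illustration.

```r
# Invented readings: channel A looks plausible, channel B is stuck at zero.
channels <- data.frame(
  pm25_A = c(8.2, 9.1, 7.6, 10.4, 9.8),
  pm25_B = c(0, 0, 0, 0, 0)
)

# Hypothetical helper: TRUE when every non-missing value is exactly zero
is_flatlined <- function(x) {
  x <- x[!is.na(x)]
  length(x) > 0 && all(x == 0)
}

# Use channel A alone if B is flatlined; otherwise average the channels
if (is_flatlined(channels$pm25_B)) {
  pm25 <- channels$pm25_A
} else {
  pm25 <- rowMeans(channels[, c("pm25_A", "pm25_B")], na.rm = TRUE)
}
print(pm25)
```

Averaging the two channels in the healthy case is only one possible choice; the point of the sketch is the check that keeps a dead channel from dragging the combined value toward zero.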



AirSensor documentation built on March 13, 2021, 1:07 a.m.