knitr::opts_chunk$set(fig.width = 7, fig.height = 5)

This article illustrates how to create a local archive of pre-generated data files for PurpleAir sensors, in order to use the AirSensor package off-line. For the purpuse of this tutorial, we are going to create a local archive for the Methow Valley's sensors by loading pre-generated PurpleAir synoptic and time searies files from an internal archive base url mantained by the AirSensor package. This is the fastest way for obtaning sensors data but, alternatively, you could download the data directly from the PurpleAir website. More info on this alternative method is available in the PurpleAir Synoptic Data and PurpleAir Time Series Data articles.

Section 1: loading pre-generated files to create a flat archive

Step 1 Load the AirSensor and dplyr libraries. You may need to install other necessary packages first (see AirSensor R Packeage for more details).

library(AirSensor)
library(dplyr)

Step 2 Set the archive base directory to NULL. This step is necessary only if you previously set a base directory, since it would take precedence over the base url we are going to set in Step 3.

setArchiveBaseDir(NULL)

Step 3 Set the archive base url to https://airsensor.aqmd.gov/PurpleAir/v1/.

setArchiveBaseUrl("https://airsensor.aqmd.gov/PurpleAir/v1/")

Step 4 Create a pas (PurpleAir Synoptic) object for the Methow Valley'sensors. The pas object contains all the sensor's data and metadata, including latitude and longitude which allow us to easily visualize the sensors on a map. You can then click the sensors on the map to see the most important metadata and data.

mvcaa <-
  pas_load() %>% # load the most recent archived 'pas'(PurpleAir Synoptic)
  pas_filter(stringr::str_detect(label, "MV Clean Air Ambassador"))
pas_leaflet(mvcaa) # dynamic map of the sensors.


Step 5 Create a list of unique time series identifiers or DeviceDeploymentIDs.

ID <- pas_getDeviceDeploymentIDs(pas = mvcaa)

If you only want to obtain a list of unique time series identifiers, skip Step 4 and 5, and follow the example below.

ID <- pas_load() %>%
  pas_getDeviceDeploymentIDs("MV Clean Air Ambassador")


Now that we have our list of unique time series identifiers, we can load the PurpuleAir time series data (heareafter pat) to build our local archive. Here we have two options: 1) build a flat archive where we simply store files; 2) build an archive using the well-known structure (aka archive base url structure). The latter will allow you to use AirSensor ~load functions for further off-line analysis, including:

If you choose option 1), got to Step 6; if you choose option 2) jump to Section 2.

Step 6 Load and save pas and pat files in a flat archive.

my_archive <- "~/my_archive" # type here your the directory where you want to 
                             # save the files 

# Load and save pat files for September 2020
for (id in ID){
  filename = paste0(my_archive, "/", id, ".rda")
  pat = pat_createNew(id = id,
                      pas = mvcaa,
                      startdate = 20200901, 
                      enddate = 20200930)
  save(pat, file = filename)
} 

# save the pas object created in Step 4 in your flat archive as an .rda file
save(mvcaa, file = paste0(my_archive, "/mvcaa.rda"))

# to load files from your flat archive you can use the get(load()) function 
mvcaa <- get(load(paste0(my_archive,"/mvcaa.rda")))

Section 2: loading pre-generated files and saving them using the well-known archive structure.

At this point you have completed Step 1 through 5 of Section 1, thus you have the pas data frame (mvcaa) and the ID list ready in your R environemnt and you can proceed with the next steps.

Step 1 Create a local archive using the well-known archive structure:

# create the pas directory
pas_2021 <- "pas/2021"
dir.create(file.path(my_archive, pas_2021), recursive = TRUE)

# create the pat directory for September 2020
pat_202009 <- "pat/2020/09"
dir.create(file.path(my_archive, pat_202009), recursive = TRUE)

# save the new pat directory as an object (we'll need this in Step 3)
pat_202009 = paste0(my_archive,"/", pat_202009)

Step 2 Save the pas object previously created (Step 4, Section 1) as an .rda file in your well known archive structure.

save(mvcaa, file = paste0(my_archive, "/pas/2021/pas_20210301.rda"))

NOTE: the date in the file name correspond to the day I loaded the data. I did it on March 1, 2021.

Step 3 Load and save the pat objects as .rda files in your well known archive structure. You'll need the ID list previously created (Step 5, Section 1).

NOTE: If one or more sensors do not have data available for the time range specified in the function, you'll get Error in pat_filterDatetime(., startdate = dateRange[1], enddate = dateRange[2], :Parameter 'pat' has no data. This error message does not communicate which sensor has no data available, and we are working on improving it to make it more specific.

for (id in ID){
  filename = paste0(pat_202009, "/", "pat_", id, "_202009", ".rda")
  pat = pat_createNew(id = id,
                      pas = mvcaa,
                      startdate = 20200901,
                      enddate = 20200930)
  save(pat, file = filename)
}

Step 4 Set your archive base directory and test if you can load the data from your archive using the AirSensor functions.

NOTE: to load pat data it is important to always include the startdate and enddate arguments. For consitency, I suggest you to include the pas object as well.

# Set your archive base directory
setArchiveBaseDir(getwd())

# Check that your archive base directory is correct
getArchiveBaseDir()

# load the pas file (refer to Step 2)
pas <- pas_load(20210301) 

# load one of the pat files to make sure you can pull data from your well-known archive 
pat <- pat_load(id = "ab5dca99422f2c0d_13669",
         pas = pas,
         startdate = 20200901,
         enddate = 20200930, 
         timezone = "America/Los_Angeles")

Step 5 Now that you have your well-known archive all set, have fun working off-line by using the AirSensor functions. In the example below, I'm plotting pat data using the pat_multiplot() and pat_scatterPlotMatrix().

pat_multiplot(pat)
pat_scatterPlotMatrix(pat)


MazamaScience/AirSensor documentation built on April 28, 2023, 11:16 a.m.