
Background

This article provides examples and explanations of how to use RAWSmet. It covers setting up the rawsDataDir, downloading data from WRCC and FW13, and using utility functions such as raws_filter~(), raws_get~(), and raws_distinct().

Using the rawsDataDir

Like data generated by MazamaSpatialUtils, RAWS data gathered by RAWSmet can be large and slow to process. To avoid downloading and parsing the same data repeatedly, RAWSmet's loading functions automatically save newly downloaded data to the rawsDataDir and load it from this directory when a user requests it again.

The following code will illustrate how to set up the rawsDataDir.

library(RAWSmet)

setRawsDataDir("~/Data/RAWS")
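
The save-then-reload behavior described above is a common caching pattern. The following base R sketch illustrates the idea only; load_cached(), the .rds file layout, and the stand-in download function are hypothetical and are not RAWSmet's actual implementation.

```r
# Hypothetical helper (not part of RAWSmet): return a cached object if one
# exists in dataDir, otherwise build it once and save it for next time.
load_cached <- function(name, dataDir, build) {
  filePath <- file.path(dataDir, paste0(name, ".rds"))
  if (file.exists(filePath)) {
    # Cache hit: read the previously saved object
    return(readRDS(filePath))
  }
  # Cache miss: do the expensive work once, then save the result
  result <- build()
  saveRDS(result, filePath)
  result
}

# Stand-in for a slow download; counts how many times it actually runs
buildCount <- 0
slow_download <- function() {
  buildCount <<- buildCount + 1
  data.frame(humidity = c(80, 82, 85))
}

# Use a fresh temporary directory as the demonstration cache
cacheDir <- file.path(tempdir(), "raws_cache_demo")
unlink(cacheDir, recursive = TRUE)
dir.create(cacheDir, recursive = TRUE)

first  <- load_cached("demo_station", cacheDir, slow_download)
second <- load_cached("demo_station", cacheDir, slow_download)
```

The second call reads the saved file instead of calling slow_download() again, which is the benefit the rawsDataDir is meant to provide.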

Downloading RAWS data

Downloading data from individual stations

RAWSmet can download live RAWS data from the WRCC as well as archival FW13 RAWS data. Data from these sources are downloaded with their respective loading functions, wrcc_loadYear() and cefa_load(). Station metadata from the two sources can be loaded with wrcc_loadMeta() and cefa_loadMeta(), respectively.

wa_meta <- wrcc_loadMeta(stateCode = "WA")
cefa_meta <- cefa_loadMeta()

head(wa_meta)
head(cefa_meta)

Note that stations provided by the WRCC have an associated wrccID, while stations provided by FW13 have an nwsID. Many WRCC stations also have an nwsID, which can be used to find data for the same station in the FW13 database.

Also note that a valid password must be provided for accessing WRCC data.
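
The examples in this article refer to the password as MY_PASSWORD. One way to define it without writing the password into a script is to read it from an environment variable, as in the sketch below. The variable name WRCC_PASSWORD is an assumption made for illustration; RAWSmet does not prescribe it.

```r
# WRCC_PASSWORD is a hypothetical environment variable name chosen for this
# example. Set it outside of R, for instance in ~/.Renviron.
MY_PASSWORD <- Sys.getenv("WRCC_PASSWORD", unset = "")

if (!nzchar(MY_PASSWORD)) {
  # Warn early rather than failing later inside a download call
  message("WRCC_PASSWORD is not set; WRCC downloads will need a password.")
}
```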

RAWS data are always stored in a raws_timeseries object. This object is a list containing two dataframes. The first, meta, contains all of the metadata for the station represented by the raws_timeseries object. The second, data, contains hourly measurements from this station.
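
To make the two-dataframe layout concrete, the sketch below builds a mock object with the same shape in base R. All identifiers and values are invented, the object is a plain list rather than a real raws_timeseries, and actual objects returned by RAWSmet contain more columns than are shown here.

```r
# Mock object mimicking the meta/data layout described above.
# Every value here is invented for illustration.
mock_raws <- list(
  # One row of station metadata
  meta = data.frame(
    nwsID = "000000",
    wrccID = "xxXXXX",
    locationName = "Demo Station",
    stringsAsFactors = FALSE
  ),
  # One row per hourly observation
  data = data.frame(
    datetime = seq(
      as.POSIXct("2020-09-01 00:00:00", tz = "UTC"),
      by = "hour",
      length.out = 4
    ),
    humidity = c(80, 82, 85, 87),
    solarRadiation = c(0, 0, 0, 12)
  )
)

names(mock_raws)
str(mock_raws$data)
```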

The following code will show how to download data from both the WRCC and FW13.

enumclaw_meta <- wa_meta %>% dplyr::filter(locationName == "Enumclaw")

wrccID_enumclaw <- enumclaw_meta$wrccID
nwsID_enumclaw <- enumclaw_meta$nwsID

print(sprintf("wrccID: %s, nwsID: %s", wrccID_enumclaw, nwsID_enumclaw))

enumclaw_wrcc <- wrcc_loadYear(wrccID = wrccID_enumclaw,
                               meta = wa_meta,
                               year = 2020,
                               password  = MY_PASSWORD)

enumclaw_fw13 <- cefa_load(nwsID = nwsID_enumclaw,
                           meta = cefa_meta)

head(enumclaw_wrcc$meta)
head(enumclaw_wrcc$data)

head(enumclaw_fw13$meta)
head(enumclaw_fw13$data)

Downloading data from multiple stations

Data from multiple stations can be downloaded in a single call with the wrcc_loadMultiple() and cefa_loadMultiple() functions. The result is a raws_list object: a list of raws_timeseries objects.

For example,

# Three wrccIDs in Washington
wrccIDs <- c("waWASH", "waWENU", "waWABE")

# Load data from these three stations
rawsList <- wrcc_loadMultiple(wrccIDs = wrccIDs,
                              meta = wa_meta,
                              year = 2020,
                              password = MY_PASSWORD)

dplyr::glimpse(rawsList)

head(rawsList$waWENU$data)
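
Because a raws_list is a list of raws_timeseries objects, ordinary list tools such as sapply() can summarize every station at once. The sketch below uses mock objects built in base R so that it runs without a download; make_mock() and the station names are hypothetical.

```r
# Hypothetical helper that builds a mock station with n hourly records
make_mock <- function(id, n) {
  list(
    meta = data.frame(wrccID = id, stringsAsFactors = FALSE),
    data = data.frame(
      datetime = seq(
        as.POSIXct("2020-01-01 00:00:00", tz = "UTC"),
        by = "hour",
        length.out = n
      ),
      humidity = rep(50, n)
    )
  )
}

# A named list of mock stations, standing in for a raws_list
mockList <- list(
  stationA = make_mock("stationA", 3),
  stationB = make_mock("stationB", 5)
)

# Count the records available for each station
rowCounts <- sapply(mockList, function(x) nrow(x$data))
rowCounts
```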

Utility functions

Filtering raws_timeseries objects

RAWSmet provides many utility functions for working with raws_timeseries objects. The first group filters raws_timeseries and raws_list objects.

startdate <- MazamaCoreUtils::parseDatetime(20200901, timezone = "America/Los_Angeles")
enddate <- MazamaCoreUtils::parseDatetime(20201001, timezone = "America/Los_Angeles")

# ----- raws_timeseries -----

# raws_filterDate
enumclaw_202009 <- enumclaw_wrcc %>%
  raws_filterDate(
    startdate = startdate,
    enddate = enddate,
    timezone = "America/Los_Angeles"
    )

print(sprintf("First observation: %s, last observation: %s", 
              enumclaw_202009$data$datetime[1],
              enumclaw_202009$data$datetime[nrow(enumclaw_202009$data)]))

# raws_filter
enumclaw_daytime <- enumclaw_wrcc %>%
  raws_filter(solarRadiation > 0)

print(sprintf("Min solar radiation before filter: %s, min solar radiation after filter: %s",
              min(enumclaw_wrcc$data$solarRadiation, na.rm = TRUE),
              min(enumclaw_daytime$data$solarRadiation, na.rm = TRUE)))

# ----- raws_list -----

rawsList_202009 <- rawsList %>%
  rawsList_filterDate(
    startdate = startdate,
    enddate = enddate,
    timezone = "America/Los_Angeles"
  )

rawsList_daytime <- rawsList %>%
  rawsList_filter(solarRadiation > 0)
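
Conceptually, both kinds of filter select rows of the data dataframe. The base R sketch below shows equivalent row selections on a small invented dataframe. It assumes a half-open interval (start included, end excluded) for the date filter; whether RAWSmet treats enddate the same way is not stated in this article.

```r
# Invented hourly data spanning midnight
data <- data.frame(
  datetime = seq(
    as.POSIXct("2020-08-31 22:00:00", tz = "UTC"),
    by = "hour",
    length.out = 6
  ),
  solarRadiation = c(0, 0, 0, 0, 5, 120)
)

start <- as.POSIXct("2020-09-01 00:00:00", tz = "UTC")
end   <- as.POSIXct("2020-09-01 03:00:00", tz = "UTC")

# Date filter: keep rows with start <= datetime < end (assumed convention)
inRange <- data[data$datetime >= start & data$datetime < end, ]

# Value filter: keep rows where the condition is TRUE
daytime <- data[data$solarRadiation > 0, ]

nrow(inRange)
nrow(daytime)
```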

Validating raws_timeseries objects

The next group of utility functions validates and cleans up raws_timeseries objects.

# Check if an object is a valid raws_timeseries
raws_isRaws(enumclaw_wrcc)
raws_isRaws(enumclaw_fw13)
raws_isRaws(c("some", "other", "object"))

# Check if a raws_timeseries is empty
raws_isEmpty(enumclaw_wrcc)

# Get an empty raws_timeseries by filtering on an impossible value
enumclaw_empty <- enumclaw_wrcc %>%
  raws_filter(humidity == 1000)

raws_isEmpty(enumclaw_empty)

# Get a new raws_timeseries by removing duplicate observations
enumclaw_wrcc_distinct <- raws_distinct(enumclaw_wrcc)
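
As a rough illustration of what removing duplicate observations means, the base R sketch below drops rows that repeat an earlier datetime. Treating datetime as the key that identifies a duplicate is an assumption made for this example; it is not taken from the raws_distinct() source.

```r
# Invented data with one repeated timestamp
data <- data.frame(
  datetime = as.POSIXct(
    c("2020-09-01 00:00:00",
      "2020-09-01 01:00:00",
      "2020-09-01 01:00:00",
      "2020-09-01 02:00:00"),
    tz = "UTC"
  ),
  humidity = c(80, 82, 82, 85)
)

# Keep only the first row seen for each datetime (assumed duplicate key)
distinctData <- data[!duplicated(data$datetime), ]

nrow(data)
nrow(distinctData)
```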

Extracting metadata/data from raws_timeseries objects

The final group of utility functions extracts the meta and data dataframes from a raws_timeseries object.

enumclaw_meta <- raws_getMeta(enumclaw_wrcc)

names(enumclaw_meta)

enumclaw_data <- raws_getData(enumclaw_wrcc)

names(enumclaw_data)

Additionally, raws_getData() can format the data dataframe for use with the openair package when called with forOpenair = TRUE.

enumclaw_data_openair <- raws_getData(enumclaw_wrcc, forOpenair = TRUE)

names(enumclaw_data_openair)
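
For context, openair functions generally expect a time column named date and, for wind plots, columns named ws and wd. The base R sketch below shows the kind of renaming this implies. The source column names windSpeed and windDirection and the rename map are assumptions made for illustration; they do not describe exactly what raws_getData() does internally.

```r
# Invented data using assumed RAWS-style column names
data <- data.frame(
  datetime = as.POSIXct(
    c("2020-09-01 00:00:00", "2020-09-01 01:00:00"),
    tz = "UTC"
  ),
  windSpeed = c(1.2, 3.4),
  windDirection = c(180, 270)
)

# Hypothetical map from the assumed column names to openair's names
renameMap <- c(datetime = "date", windSpeed = "ws", windDirection = "wd")

# Rename only the columns that appear in the map
openairData <- data
matched <- names(openairData) %in% names(renameMap)
names(openairData)[matched] <- unname(renameMap[names(openairData)[matched]])

names(openairData)
```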


MazamaScience/RAWSmet documentation built on May 6, 2023, 6:57 a.m.