layout: true


# Copyright 2019 Province of British Columbia
# 
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# 
# http://www.apache.org/licenses/LICENSE-2.0
# 
# Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and limitations under the License.
options(htmltools.dir.version = FALSE)
options(width = 90)
options(max_print = 5)

knitr::opts_chunk$set(
  collapse = TRUE,
  echo = FALSE,
  comment = "#>",
  fig.path = "graphics/prod/figs"
)

options(scipen = 10)
library(tidyhydat)
library(weathercan)
library(knitr)
library(tidyverse)
library(lubridate)
library(corrr)
library(sf)
library(rnaturalearth)
library(fontawesome)
bg_black <- "#272822"

theme_set(theme_void() %+replace%
            theme(legend.text = element_text(colour = "white", size = 18),
                  legend.title = element_text(colour = "white", size = 18),
                  plot.background = element_rect(fill = bg_black, color = bg_black),
                  axis.text = element_text(colour = "white", size = 16),
                  axis.title = element_text(colour = "white", size = 18),
                  axis.title.y = element_text(angle = 90, vjust = 1),
                  plot.title = element_text(colour = "white", size = 22, hjust = 0)))


scale_colour_continuous <- scale_colour_viridis_c
scale_fill_continuous <- scale_fill_viridis_c
scale_colour_discrete <- scale_colour_viridis_d
scale_fill_discrete <- scale_fill_viridis_d

Outline

.VeryLarge[ - Who am I? - Learning Outcomes - Review R and RStudio and rationale behind using them - Introduce packages: - dplyr - tidyhydat - weathercan - Provide an example of using them together - tidyhydat and weathercan development - Where and how to get help in R - Questions ]


Sam Albers

.pull-left[ - Data Scientist with BC government - Environmental Scientist by training - Been using R for 10 years - Maintainer for tidyhydat, rsoi - Contributor on many other packages including weathercan - Maintainer of the Hydrology task view ] .pull-right[

Drawing

r fa(name = "twitter") @big_bad_sam
r fa(name = "github") @boshek
r fa(name = "paper-plane") sam.albers@gov.bc.ca ]


What are we hoping to learn?

.pull-left[ .VeryLarge[ - Describe visual elements of RStudio - Define and assign data to variable - Manage your workspaces and projects - Call a function - Understand the six main dplyr verbs - Overview of tidyhydat and weathercan functions - Describe usage of tidyhydat and weathercan - How to ask for help in R ] ]

.pull-right[

]


class: inverse, center, middle

Common Analysis Problems


class: center, basic

Accessing Environment and Climate Change Canada Data

include_graphics("graphics/ec_data_explorer2.gif")

11 clicks!


class: basic, center

Stakeholder/Manager: "Hey, this is a really cool analysis but we need to add five stations. Can you run it again?"

--

Make it reproducible!


Questions worth asking...

.large[ - Are your methods reproducible? - What is your analysis recipe? - Can you share it? ]

Drawing


class: inverse, left, middle

...Use R!

.pull-left[ (or more generally any programmatic code based analysis approach...) ]

Drawing


.pull-left[

What is R?

.large[ - Free and open source - Statistical programming language - Publication quality graphics - Much of the innovation occurs in contributed packages - But definitely not intimidating... ]

Some example code

all_time_greats <- c(99, 66, 4, 9)

]

-- .pull-right[

What is RStudio?

.large[ - Provides a place to write and run code - A means to organize projects - Referred to as an IDE ] --

Not guaranteed to help with this...

]


class: inverse, left, middle

R and RStudio


.pull-left[

The Problem

]

--

.pull-right[

Enter dplyr

a consistent set of verbs that help you solve the most common data manipulation challenges

]


dplyr verbs

Functions with English meanings that map directly to the action being taken when that function is called

Installation: install.packages("dplyr")

.pull-left[ - %>% a special symbol to chain operations. Read it as "then" - select() picks variables based on their names. - filter() picks cases based on their values. - summarise() reduces multiple values down to a single summary. - arrange() changes the ordering of the rows. - mutate() adds new variables that are functions of existing variables

For a offline tutorial: http://swcarpentry.github.io/r-novice-gapminder/13-dplyr/index.html ]

.pull-right[

Artwork by @allison_horst ]


class: inverse, left, middle

dplyr code break


class:basic

The objective of tidyhydat is to provide a standard method of accessing ECCC hydrometric data sources (historical and real time) using a consistent and easy to use interface that employs tidy data principles within the R project.

Drawing

Installation: install.packages("tidyhydat")


hydat::Water Survey of Canada Network

.pull-left[

stns <- hy_stations() %>% 
  filter(HYD_STATUS == "ACTIVE")

stns_sf <- st_as_sf(stns, coords = c("LONGITUDE","LATITUDE"),
             crs = 4326,
             agr= "constant") 

can <- ne_countries(country = "Canada", returnclass = "sf")

ggplot() +
  geom_sf(data = can, fill = NA) +
  geom_sf(data = stns_sf, size = 1, aes(colour = PROV_TERR_STATE_LOC)) +
  guides(colour = FALSE) +
  coord_sf(crs = 102009, datum = NA) +
  theme_void()

]

.pull-right[

r round(fs::file_size(file.path(hy_dir(), "Hydat.sqlite3"))/1E9,2) GB

r nrow(hy_stations()) stations in database

SQLite database

Self contained

]


class:basic

The objective of weathercan is to provide a standard method of accessing ECCC climate data sources using a consistent and easy to use interface that employs tidy data principles within the R project.

Drawing

Installation: install.packages("weathercan")

weathercan::Climate Data

.pull-left[

stations_unique <- stations %>% 
  select(prov, station_name, lat, lon) %>% 
  unique()



weather_stns_sf <- stations_unique %>%
  filter(!station_name == "POINT LEPREAU") %>% 
  filter(!is.na(lat), !is.na(lon)) %>% 
  st_as_sf(coords = c("lon","lat"),
             crs = 4326,
             agr= "constant") 

can <- ne_countries(country = "Canada", returnclass = "sf")

ggplot() +
  geom_sf(data = can, fill = NA) +
  geom_sf(data = weather_stns_sf, size = 1, aes(colour = prov)) +
  guides(colour = FALSE) +
  coord_sf(crs = 102009, datum = NA) +
  theme_void()

]

.pull-right[

r length(unique(stations$station_name)) stations

Available online

]

class: inverse, center, middle

Looking closer at tidyhydat and weathercan


class: basiclh

tidyhydat

Download the database:

download_hydat()

Access some flow data

flows_data <- hy_daily_flows(station_number = c("08MF005","09CD001","05KJ001","02KF005"))

What else is available in tidyhydat?

All tables in HYDAT

.Large[ - See help(package = "tidyhydat") - Realtime data - Instantaneous peaks - Daily, monthly and yearly temporal summaries - Discharge, level, sediment, particle size - Data ranges - Station metadata ]


What else is available in tidyhydat?

plot(flows_data)

What else is available in tidyhydat?

search_stn_name("fraser")

class: inverse, left, middle

tidyhydat code break


weathercan

vic_gonzales <- weather_dl(station_ids = "114", interval = "day", start = "2019-01-01", end = "2019-01-31")
vic_gonzales

What else is available in weathercan?

.Large[ - See help(package = "weathercan") - Normals - Climate normals measurements - Station metadata ]


class: inverse, left, middle

weathercan code break


Path of Hurricane Dorian

## Download the data
zip_path <- tempfile()

download.file("https://www.nhc.noaa.gov/gis/best_track/al052019_best_track.zip",
              destfile = zip_path)
unzip(zip_path, exdir = "data/dorian")

## Read in
dorian <- read_sf("data/dorian/AL052019_pts.shp") %>% 
  st_transform(3347)
north_america = ne_countries(continent = "North America", returnclass = "sf") %>% 
  st_transform(3347)
ggplot() +
  geom_sf(data = north_america, aes(fill = sovereignt), alpha = 0.3) +
  geom_sf(data = dorian, colour = "black") +
  guides(fill = FALSE) +
  theme_void()

Point where Dorian is over Canadian land

canada = ne_states(country = "Canada", returnclass = "sf") %>% 
  st_transform(3347)

dorian_canada <- st_intersection(dorian, canada)


ggplot() +
  geom_sf(data = canada, aes(fill = name), alpha = 0.3) +
  geom_sf(data = dorian_canada, colour = "purple", size = 3) +
  guides(fill = FALSE) +
  theme_void()

Nova Scotia with buffer

dorian_buffer <- st_buffer(dorian_canada, dist = 7E4)

maritimes <- canada %>% 
  filter(name %in% c("New Brunswick", "Nova Scotia", "Prince Edward Island"))


ggplot() +
  geom_sf(data = maritimes, aes(fill = name), alpha = 0.3) +
  geom_sf(data = dorian_canada, colour = "purple", size = 3) +
  geom_sf(data = dorian_buffer, colour = "orange", alpha = 0.5) +
  guides(fill = FALSE) +
  theme_void()

Hydrometric Stations

hydro_stations <- realtime_stations() %>% 
  st_as_sf(coords = c("LONGITUDE", "LATITUDE"),
           crs = 4326,
           agr = "constant") %>% 
  st_transform(3347)

climate_stations <- stations %>% 
  filter(end == 2019) %>% 
  filter(interval == "hour") %>% 
  filter(!is.na(lon), !is.na(lat)) %>% 
  st_as_sf(coords = c("lon", "lat"),
           crs = 4326,
           agr = "constant") %>% 
  st_transform(3347)

hydro_dorian <- st_intersection(hydro_stations, dorian_buffer)

climate_dorian <- st_intersection(climate_stations, dorian_buffer)

ggplot() +
  geom_sf(data = maritimes, aes(fill = name), alpha = 0.3) +
  geom_sf(data = dorian_canada, colour = "purple", size = 3) +
  geom_sf(data = dorian_buffer, colour = "orange", alpha = 0.5) +
  geom_sf(data = hydro_dorian, aes(colour = STATION_NUMBER)) +
  guides(fill = FALSE) +
  theme_void()

Hydro Data

hydro_dorian$STATION_NUMBER
hydro_data <- realtime_dd(station_number = hydro_dorian$STATION_NUMBER) %>% 
  filter(Parameter == "Level") 

hydro_data

Hydro Data

dorian_date <- as.POSIXct(paste0(dorian_canada$YEAR, dorian_canada$MONTH, dorian_canada$DAY, " 00:00:01"), "%Y%m%d %H:%M:%S", tz = "GMT")

hydro_data %>% 
  ggplot(aes(x = Date, y = Value, colour = STATION_NUMBER)) +
  geom_line() +
  geom_vline(xintercept = dorian_date, colour = "black", linetype = 2) +
  facet_wrap(~STATION_NUMBER, scales = "free_y") +
  labs(y = "Level (m)") +
  theme_minimal()

Climate Stations

ggplot() +
  geom_sf(data = maritimes, aes(fill = name), alpha = 0.3) +
  geom_sf(data = dorian_canada, colour = "purple", size = 3) +
  geom_sf(data = dorian_buffer, colour = "orange", alpha = 0.5) +
  geom_sf(data = climate_dorian, aes(colour = station_name)) +
  guides(fill = FALSE) +
  theme_void()

Climate Data

climate_dorian$station_id
climate_data <- weather_dl(station_ids = climate_dorian$station_id, 
                           start = "2019-09-01", interval = "hour", quiet = TRUE)

climate_data

Climate Data

ggplot(climate_data) +
  geom_line(aes(x = time, y = wind_spd, colour = station_name)) +
  geom_vline(xintercept = dorian_date, colour = "black", linetype = 2) +
  facet_wrap(~station_name) +
  labs(y = "Wind Speed (km/h)") +
  theme_minimal()

What else is available in R - ggplot2

library(ggplot2)

canada_stations <- hy_stations(prov_terr_state_loc = "CA") %>% 
  filter(DRAINAGE_AREA_GROSS < 10000)

ggplot(canada_stations, aes(x = DRAINAGE_AREA_GROSS, fill = HYD_STATUS)) +
  geom_density(alpha = 0.5) +
  labs(x = "Mean long term annual discharge (m^3)", y = "Gross drainage area (km^2)") +
  theme_minimal() +
  facet_wrap(~PROV_TERR_STATE_LOC, scales = "free_y")

What else is available in R - ggplot2

library(ggplot2)

canada_stations <- hy_stations(prov_terr_state_loc = "CA") %>% 
  filter(DRAINAGE_AREA_GROSS < 10000)

ggplot(canada_stations, aes(x = DRAINAGE_AREA_GROSS, fill = HYD_STATUS)) +
  geom_density(alpha = 0.5) +
  labs(x = "Gross drainage area (km^2)") +
  facet_wrap(~PROV_TERR_STATE_LOC, scales = "free_y") +
  theme_minimal()

It can be hard!


Resources for R

Drawing

Drawing

Drawing


class: inverse, left, middle

Reprex

Prepare Reproducible Example Code via the Clipboard

Drawing


Contribute to tidyhydat and weathercan

Openly developed on GitHub

https://github.com/ropensci/tidyhydat

https://github.com/ropensci/weathercan

.pull-left[ Any contribution helps. You don't have to be an R programmer! - Questions - Ideas / Feature requests - Bugs - Bug-fixes - Development ] .pull-right[

Drawing
Drawing
]


Ways to contribute

tidyhydat

weathercan


Ways to cite

📝Albers S (2017). “tidyhydat: Extract and Tidy Canadian Hydrometric Data.” The Journal of Open Source Software, 2(20). doi: 10.21105/joss.00511, http://dx.doi.org/10.21105/joss.00511.

📝LaZerte S, Albers S (2018). “weathercan: Download and format weather data from Environment and Climate Change Canada.” The Journal of Open Source Software, 3(22), 571. http://joss.theoj.org/papers/10.21105/joss.00571.

th <- citation("tidyhydat")
print(th, style = "html")
wc <- citation("weathercan")
print(wc, style = "html")

Drawing

class: center

Some Helpful Links

Intro R & RStudio: https://r4ds.had.co.nz

Getting started with tidyhydat: https://docs.ropensci.org/tidyhydat

Getting started with weathercan: https://ropensci.github.io/weathercan

Hydrology CRAN task view: https://CRAN.R-project.org/view=Hydrology

rOpenSci: https://ropensci.org

📝 But we all have to work in excel so read this: https://www.tandfonline.com/doi/full/10.1080/00031305.2017.1375989


class: basic background-image: url(https://media.giphy.com/media/TnDoEoXfT7YoE/giphy.gif) background-size: cover

Questions?

.content-box-blue[ Slides available from

.small[ https://github.com/ropensci/tidyhydat/blob/master/presentations/tidyhydat_weathercan/tidyhydat_weather.pdf https://github.com/ropensci/tidyhydat/blob/master/presentations/tidyhydat_weathercan/tidyhydat_weather.Rmd ] Contact sam.albers@gov.bc.ca ]



bcgov/tidyhydat documentation built on Jan. 15, 2024, 4:03 a.m.