tidyindex

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(tidyindex)
library(dplyr)
library(lubridate)
library(lmomco)
library(ggplot2)
library(tsibble)

The tidyindex package provides functionality to construct indexes in a data pipeline, align with the tidyverse paradigm. The pipeline approach is universally applicable to indexes of all kinds. It allows indexes to be broken down into a set of defined building blocks (modules) and hence provides means to standardise the workflow to construct, compare, and analyse indexes.

Decomposing an index into steps

Here we present an example to calculate one of the most widely used drought index: Standardised Precipitation Index (SPI). The index is composed to three steps:

Pipeline design

These three steps correspond to three modules in the tidyindex pipeline (temporal_aggregate(), distribution_fit(), and normalise()). Each module uses a tidyverse-mutate style to calculate a step within the module. For example, the following code fits a gamma distribution to the variable .agg. Different distributions are available and prefixed with dist_*() and additional distribution can be added by the user following a similar style to the existing dist_*() steps. The step dist_*() can also be evaluated standalone and seen as a recipe of the step:

distribution_fit(.fit = dist_gamma(...))
dist_gamma(var = ".agg")

Standardised Precipitation Index (SPI): An example

Here we select a single station, Texas Post Office, where is heavily impacted during the 2019/20 bushfire season, in Queensland, Australia, to demonstrate the calculation.

texas_post_office <- queensland %>% 
  filter(name == "TEXAS POST OFFICE") %>% 
  mutate(month = lubridate::month(ym)) 

dt <- texas_post_office |>
  init(id = id, time = ym, group = month) |> 
  temporal_aggregate(.agg = temporal_rolling_window(prcp, scale = 24)) |> 
  distribution_fit(.fit = dist_gamma(var = ".agg")) |>
  tidyindex::normalise(.index = norm_quantile(.fit))
dt

The results contain a summary of the steps used and the data with intermediate variables (.agg, .fit, and .fit_obj) and the index (.index). We can plot the result using ggplot2 as:

dt$data |> 
  ggplot(aes(x = ym, y = .index)) + 
  geom_hline(yintercept = -2, color = "red",  linewidth = 1) + 
  geom_line() + 
  scale_x_yearmonth(name = "Year", date_break = "2 years", date_label = "%Y") +
   theme_bw() +
  facet_wrap(vars(name), ncol = 1) + 
  theme(panel.grid = element_blank(), 
        legend.position = "bottom") + 
  ylab("SPI")

What's more

There are many different things you can do with the package, for example:



Try the tidyindex package in your browser

Any scripts or data that you put into this service are public.

tidyindex documentation built on Nov. 16, 2023, 5:08 p.m.