R/hydrometContinuous.R

Defines functions hydrometContinuous

Documented in hydrometContinuous

#' Continuous hydromet data plotting
#'
#' @description
#' `r lifecycle::badge('stable')`
#'
#' Generate plots of water level/flow, snow pillows, or other continuous data present in the local hydrometric database generated by package WRBdatabase. For any given start/end day of year. Integrates return periods if requested. Intended to supersede [WRBfloods::levelPlot()] and [WRBfloods::flowPlot()]: see details.
#'
#' Notice: in many cases, you're better off using the Shiny app at [WRBfloods::hydroApp()] to generate and export your plot. Read on if you need additional control over the final product.
#'
#' This function plots data from the local hydrometric database (maintained by the WRBdatabase package) and yields consistent-looking plots for continuous data. This function can only plot what's in the database, use the function [DB_browse_ts()] to see what's in there first. Plots will include bands representing the historic min, max, and 25th-75th percentiles based on the last year of data requested. If necessary for performance, data may be down-sampled to daily means: see details.
#'
#' @details
#' Superseded functions [WRBfloods::levelPlot()] and [WRBfloods::flowPlot()] can be used in the event that this function does not yield your desired graph, or if you cannot use the local database and must use data directly from the Water Survey of Canada.
#'
#' ## Down sampling of data
#' When requesting time periods where high-frequency (realtime) data is available, some of the data may be displayed as daily means instead to improve performance. Specifically, only day ranges of less than 60 days will have high-frequency data (61 days or more will be plotted with daily means). Additionally, daily means will be fetched once the number of high-frequency rows exceeds 20 000 after year specified under parameter `years` (this will plot up to two 60-day lines at 5 minute recording intervals). In all cases, high-frequency data will be fetched nearest to the specified `endDate`.
#'
#' ## Return periods:
#' Return periods generated using the "calculate" option are calculated using all available after applying the parameter `return_max_year`. Extreme values are isolated for each year and passed to [fasstr::compute_frequency_analysis()] using a Log Pearson Type 3 distribution and the method of moments to fit a curve to the distribution. Years (or rather periods within years; see below) missing more than the percentage of data specified in `allowed_missing` are excluded from consideration.
#'
#' As the time of year during which minimum or maximum values happen is variable and to prevent filtering out years that are missing irrelevant data (most rivers peak in spring, some peak in late summer, snow pack peaks in winter), the parameters return_months and allowed_missing should be carefully selected. This may require you to make a graph first to evaluate when the annual min/max of interest happens. Consider as well that you can ask for return periods specific for a month or season, such as return periods of minimum flows in September, or maximum flows in January. This also applies to levels or any other continuous parameter which this function can plot.
#'
#' If specifying the option "table", the function will attempt to find returns for the location within the package internal data. The returns contained there are from consultant work involving a detailed analysis of which data to keep/discard, as well as a human determination of the most appropriate distribution and curve fitting methods. The caveat here is that these tables are not necessarily up to date, that most locations have no calculated return periods, and that return periods only exist for water levels.
#'
#' @param location The location for which you want a plot.
#' @param parameter The parameter you wish to plot. The location:parameter combo must be in the local database.
#' @param startDay The start day of year for the plot x-axis. Can be specified as a number from 1 to 365, as a character string of form "yyyy-mm-dd", or as a date object. Either way the day of year is the only portion used, specify years to plot under parameter `years`.
#' @param endDay The end day of year for the plot x-axis. As per `startDay`.
#' @param tzone The timezone to use for graphing. Only really evident for a small number of days.
#' @param years The years to plot. If `startDay` and `endDay` cover December 31 - January 1, select the December year(s). Max 10 years, NULL = current year.
#' @param datum Should a vertical datum be applied to the data, if available? TRUE or FALSE.
#' @param title Should a title be included?
#' @param returns Should returns be plotted? You have the option of using pre-determined level returns only (option "table"), auto-calculated values(option "calculate"), "auto" (priority to "table", fallback to "calculate"), or "none". Defaults to "auto".
#' @param return_type Use minimum ("min") or maximum ("max") values for returns?
#' @param return_months Numeric vector of months during which to look for minimum or maximum values. Only works with calculated returns. Does not have to be within `startDay` and `endDay`, but will only consider data up to the last year specified in `years`. For months overlapping the new year like November-April, should look like c(11:12,1:4). IMPORTANT: the first month in the range should be the first element of the vector: c(1:4, 11:12) would not be acceptable. Think of it as defining a season. Passed to 'months' argument of [fasstr::calc_annual_extremes()] and also used to set the 'water_year_start' parameter of this function.
#' @param return_max_year The last year of data to consider when calculating returns. Default uses all years *prior to* the last year plotted, giving historical context. Cannot be greater than the last year plotted but can be earlier than the first.
#' @param allowed_missing Allowable % of data missing during the months specified in 'return_months' to still retain the year for analysis when calculating returns. Passed to 'allowed_missing' argument of [fasstr::calc_annual_extremes()].
#' @param plot_scale Adjusts/scales the size of plot text elements. 1 = standard size, 0.5 = half size, 2 = double the size, etc. Standard size works well in a typical RStudio environment.
#' @param save_path Default is NULL and the graph will be visible in RStudio and can be assigned to an object. Option "choose" brings up the File Explorer for you to choose where to save the file, or you can also specify a save path directly.
#' @param dbPath The path to the local hydromet database, passed to [hydroConnect()].
#'
#' @return A .png file of the plot requested (if a save path has been selected), plus the plot displayed in RStudio. Assign the function to a variable to also get a plot in your global environment as a ggplot object which can be further modified
#' @export
#'

hydrometContinuous <- function(location,
                               parameter,
                               startDay = 1,
                               endDay = 365,
                               tzone = "MST",
                               years = NULL,
                               datum = TRUE,
                               title = TRUE,
                               returns = "auto",
                               return_type = "max",
                               return_months = c(5:9),
                               return_max_year = max(years)-1,
                               allowed_missing = 10,
                               plot_scale = 1,
                               save_path = NULL,
                               dbPath = "default")
{

  # Commented code below is for testing...
  # location = "09EA004"
  # parameter = "flow"
  # startDay = "2022-09-01"
  # endDay = "2022-10-01"
  # tzone = "MST"
  # years = c(2022)
  # datum = TRUE
  # title = TRUE
  # returns = "calculate"
  # return_type = "max"
  # return_months = c(9,10)
  # return_max_year = max(years)-1
  # allowed_missing = 10
  # plot_scale = 1
  # save_path = NULL
  # dbPath <- "C:/Users/g_del/Documents/R/KlondikeLandslides/data/hydro.sqlite"

  # Checks on input parameters  and other start-up bits------------------
  if (parameter != "SWE"){
    parameter <- tolower(parameter)
  }
  return_type <- tolower(return_type)
  returns <- tolower(returns)
  if (!returns %in% c("table", "auto", "calculate", "none")){
    stop("Your entry for the parameter 'return' is invalid. Please review the function documentation and try again.")
  }

  if (is.null(years)){
    years <- as.numeric(substr(Sys.Date(), 1, 4))
  } else {
    years <- as.numeric(years)
    years <- sort(years)
    if (length(years) > 10){
      years <- years[(length(years)-10):length(years)]
      print("The parameter 'years' can only have up to 10 years. It's been truncated to the most recent 10 years.")
    }
  }
  # Select save path
  if (!is.null(save_path)){
    if (save_path %in% c("Choose", "choose")) {
      print("Select the folder where you want this graph saved.")
      save_path <- as.character(utils::choose.dir(caption="Select Save Folder"))
    }
  }

  if (return_max_year > max(years)){
    return_max_years <- max(years)
    message("Your parameter entry for 'return_max_years' is invalid (greater than the last year plotted). It has been adjusted to the last year plotted, or to the last year with enough data.")
  }

  #Connect
  con <- WRBtools::hydroConnect(path = dbPath, silent = TRUE)
  on.exit(DBI::dbDisconnect(con))

  #Confirm parameter and location exist in the database and that there is only one entry
  exist_check <- DBI::dbGetQuery(con, paste0("SELECT location, parameter FROM timeseries WHERE location = '", location, "' AND parameter = '", parameter, "' AND type = 'continuous'"))
  if (nrow(exist_check) == 0){
    stop(paste0("There doesn't appear to be a match in the database for location ", location, ", parameter ", parameter, ", and continuous data type."))
  } else if (nrow(exist_check) > 1){
    stop(paste0("There is more than one entry in the database for location ", location, ", parameter ", parameter, ", and continuous data type. Please alert the person responsible for database maintenance."))
  }

  # Dealing with start/end dates ----------------------
  # Sort out startDay and endDay into actual dates if needed
  last_year <- max(years)
  leap_list <- (seq(1800, 2100, by = 4))  # Create list of all leap years
  tryCatch({
    startDay <- as.character(startDay)
    startDay <- as.POSIXct(startDay, tz = tzone)
    lubridate::year(startDay) <- last_year
  }, error = function(e) {
    if (last_year %in% leap_list){
      if (startDay > 59){
        startDay <<- startDay + 1
      }
    }
    startDay <<- as.POSIXct(as.numeric(startDay)*60*60*24, origin = paste0(last_year-1, "-12-31"), tz = "UTC")
    startDay <<- lubridate::force_tz(startDay, tzone)
  })
  tryCatch({
    endDay <- as.character(endDay)
    endDay <- as.POSIXct(endDay, tz = tzone)
    lubridate::year(endDay) <- last_year
  }, error = function(e) {
    tempStartDay <- lubridate::yday(startDay) #using yday because start is now in proper Date format and needs to be back-converted to yday
    if (last_year %in% leap_list){
      if (endDay > 59){
        endDay <<- endDay + 1
      }
    }
    endDay <<- as.POSIXct(as.numeric(endDay)*60*60*24, origin = paste0(last_year-1, "-12-31 23:59:59"), tz = "UTC")
    endDay <<- lubridate::force_tz(endDay, tzone)
  })
  if (startDay > endDay){ #if the user is wanting a range overlapping the new year
    lubridate::year(endDay) <- lubridate::year(endDay)+1
    overlaps <- TRUE
  } else {
    overlaps <- FALSE
  }
  day_seq <- seq.POSIXt(startDay, endDay, by = "day")

  # Find the necessary datum
  if (datum & parameter == "level"){
    datum <- DBI::dbGetQuery(con, paste0("SELECT * FROM datum_conversions WHERE location = '", location, "'"))
    datum <- datum[datum$current == TRUE , ]
    datum_name <- DBI::dbGetQuery(con, paste0("SELECT datum_name_en FROM datum_list WHERE datum_id = ", datum$datum_id_to))
  } else {
    datum <- data.frame(conversion_m = 0)
  }

  #Find the ts units
  units <- DBI::dbGetQuery(con, paste0("SELECT units FROM timeseries WHERE parameter = '", parameter, "' AND location = '", location, "'"))

  # Get the necessary data -------------------
  realtime <- data.frame()
  dates <- lubridate::POSIXct(tz = tzone) #creates empty posixct vector
  daily_end <- endDay
  if (overlaps){
    lubridate::year(daily_end) <- last_year + 1
  } else {
    lubridate::year(daily_end) <- last_year
  }
  daily_end <- daily_end + 60*60*24
  daily <- DBI::dbGetQuery(con, paste0("SELECT * FROM daily WHERE location = '", location, "' AND parameter = '", parameter, "' AND date <= '", as.character(daily_end), "'")) #Everything daily is fetched as this is necessary to calculate return periods. This *could* be truncated if calculated returns are not called or not a possibility because returns == 'table'...
  daily$date <- as.POSIXct(daily$date, tz = tzone) #to posixct and not date so that it plays well with realtime df
  names(daily)[names(daily) == "date"] <- "datetime_UTC"
  for (i in rev(years)){ #Using rev so that the most recent year gets realtime, if possible
    start <- as.POSIXct(paste0(i, substr(startDay, 5, 16)), tz = tzone)
    start_UTC <- start
    attr(start_UTC, "tzone") <- "UTC"
    end <- as.POSIXct(paste0(i, substr(endDay, 5, 10), " 23:59:59"), tz = tzone)
    if (overlaps){
      lubridate::year(end) <- lubridate::year(end) + 1
    }
    end_UTC <- end
    attr(end_UTC, "tzone") <- "UTC"
    if (length(day_seq) < 90){
      if (nrow(realtime) < 20000){
        new_realtime <- DBI::dbGetQuery(con, paste0("SELECT * FROM realtime WHERE location = '", location, "' AND parameter = '", parameter, "' AND datetime_UTC BETWEEN '", as.character(start_UTC), "' AND '", as.character(end_UTC), "'")) #SQL BETWEEN is inclusive
        new_realtime <- utils::tail(new_realtime, 30000) #Retain only max 20000 data points for plotting performance
        new_realtime$datetime_UTC <- as.POSIXct(new_realtime$datetime_UTC, tz = "UTC") #Comes out of the DB as.character, so convert
        attr(new_realtime$datetime_UTC, "tzone") <- tzone
        realtime <- realtime[!is.na(realtime$value) , ] #remove empty data points, if any exist. These are later filled with NAs (1 per day) so the lines don't go across.
        realtime <- rbind(realtime, new_realtime)
        if (nrow(new_realtime) > 0){
          new_dates <- seq.POSIXt(start, end, by = "days")
          dates <- c(dates, new_dates)
          get_daily <- FALSE
        } else {
          get_daily <- TRUE
        }
      } else {
        get_daily <- TRUE
      }
    } else {
      get_daily <- TRUE
    }
    if (get_daily) {
      new_realtime <- daily[daily$datetime_UTC >= start & daily$datetime_UTC <= end , ]
      new_realtime <- new_realtime[ , c("location", "parameter", "datetime_UTC", "value", "grade", "approval")]
      realtime <- rbind(realtime, new_realtime)
    }
  }
  #Find out where values need to be filled in with daily means
  if (length(dates) > 0){
    for (i in 1:length(dates)){
      toDate <- as.Date(dates[i]) #convert to plain date to check if there are any datetimes with that plain date
      if (!(toDate %in% as.Date(realtime$datetime_UTC))){
        row <- daily[daily$datetime_UTC == dates[i] , c("location", "parameter", "datetime_UTC", "value", "grade", "approval")]
        realtime <- rbind(realtime, row)
      }
    }
  }
  if (nrow(realtime) == 0){
    stop("There is no data to plot within this range of years and days.")
  }
  #Add the ribbon values for the times between startDay and endDay
  ribbon <- data.frame()
  ribbon_seq <- seq.Date(min(as.Date(day_seq)), max(as.Date(day_seq)+1), by = "day") #Gets extended a day so that the ribbon matches the full length of the data (if high-res data is acquired, data points will go the 23:59 the next day but the ribbon ends at 00:00)
  for (i in 1:length(ribbon_seq)){
    target_date <- ribbon_seq[i]
    if (!(lubridate::month(target_date) == 2 & lubridate::day(target_date) == 29)){
      if (overlaps){
        row <- daily[daily$datetime_UTC == target_date, !names(daily) %in% c("value", "grade", "approval")]
        if (nrow(row) == 0){
          lubridate::year(target_date) <- lubridate::year(target_date) -1
          row <- daily[daily$datetime_UTC == target_date, !names(daily) %in% c("value", "grade", "approval")]
        }
      } else {
        row <- daily[daily$datetime_UTC == target_date, !names(daily) %in% c("value", "grade", "approval")]
        if (nrow(row) == 0){
          prev_date <- target_date
          lubridate::year(prev_date) <- last_year - 1
          row <- daily[daily$datetime_UTC == prev_date, !names(daily) %in% c("value", "grade", "approval")]
          lubridate::year(row$datetime_UTC) <- last_year
        }
        if (i == length(ribbon_seq)){
          row$datetime_UTC <- row$datetime_UTC - 1 #This keeps the last ribbon point within the same days as the day sequence requested. Without this, a last day requested of 365 causes a point to show up in the following year.
        }
      }
      if (nrow(row) > 0){
        ribbon <- rbind(ribbon, row)
      }
    }
  }
  realtime <- merge(realtime, ribbon, all = TRUE)

  realtime$year <- lubridate::year(realtime$datetime_UTC) #year, month columns used for removing Feb 29 later
  realtime$month <- lubridate::month(realtime$datetime_UTC)
  realtime$day <- lubridate::day(realtime$datetime_UTC)
  realtime <- realtime[!(realtime$month == 2 & realtime$day == 29), ] #Remove Feb 29
  daily$year <- lubridate::year(daily$datetime_UTC)
  daily$month <- lubridate::month(daily$datetime_UTC)
  daily$day <- lubridate::month(daily$datetime_UTC)
  daily <- daily[!(daily$month == 2 & daily$day == 29), ] #Remove Feb 29

  if (overlaps){ # This section sorts out the overlap years, builds the plotting column
    temp <- data.frame(date = day_seq)
    temp$year = lubridate::year(temp$date)
    temp <- temp[!(temp$year == max(temp$year)), ] #Remove the rows for days after the new year
    temp$month = lubridate::month(temp$date)
    temp$day = lubridate::day(temp$date)
    temp$day <- stringr::str_pad(temp$day, 2, side = "left", pad = "0")

    #Column md is built in both temp and realtime dfs to be able to differentiate the previous year from the next and assign proper plot years (i.e. 2022-2023) and fake datetimes (since every year needs the same "fake year" to plot together)
    temp$md <- paste0(temp$month, temp$day)
    temp$md <- as.numeric(temp$md)
    md_sequence <- seq(min(temp$md), max(temp$md))

    realtime$day <- stringr::str_pad(realtime$day, 2, side = "left", pad = "0")
    realtime$md <- paste0(realtime$month, realtime$day)
    realtime$md <- as.numeric(realtime$md)

    realtime$fake_datetime <- as.POSIXct(rep(NA, nrow(realtime)))
    realtime$plot_year <- NA
    for (i in 1:nrow(realtime)){  #!!!This desperately needs to be vectorized in some way. Super slow!
      fake_datetime <- gsub("[0-9]{4}", if (realtime$md[i] %in% md_sequence) last_year else last_year + 1, realtime$datetime_UTC[i])
      fake_datetime <- ifelse(nchar(fake_datetime) > 11, fake_datetime, paste0(fake_datetime, " 00:00:00"))
      realtime$fake_datetime[i] <- as.POSIXct(fake_datetime, tz = tzone)
      realtime$plot_year[i] <- if (realtime$md[i] %in% md_sequence) paste0(realtime$year[i], "-", realtime$year[i] +1) else paste0(realtime$year[i] -1, "-", realtime$year[i])
    }
  } else { #Does not overlap the new year
    realtime$plot_year <- as.character(realtime$year)
    realtime$fake_datetime <- gsub("[0-9]{4}", last_year, realtime$datetime_UTC)
    realtime$fake_datetime <- ifelse(nchar(realtime$fake_datetime) > 11, realtime$fake_datetime, paste0(realtime$fake_datetime, " 00:00:00"))
    realtime$fake_datetime <- as.POSIXct(realtime$fake_datetime, tz = tzone, format='%Y-%m-%d %H:%M:%S') #Make fake datetimes to permit plotting years together as separate lines. This DOESN'T work if Feb 29 isn't removed first!
  }

  # apply datum correction where necessary
  if (datum$conversion_m > 0){
    daily[ , c("value", "max", "min", "QP90", "QP75", "QP50", "QP25", "QP10")] <- apply(daily[ , c("value", "max", "min", "QP90", "QP75", "QP50", "QP25", "QP10")], 2, function(x) x + datum$conversion_m)
    realtime[ , c("value", "max", "min", "QP90", "QP75", "QP50", "QP25", "QP10")] <- apply(realtime[ , c("value", "max", "min", "QP90", "QP75", "QP50", "QP25", "QP10")], 2, function(x) x + datum$conversion_m)
  }

  # Make the plot --------------------
  colours = c("blue", "black", "darkorchid3", "cyan2", "firebrick3", "aquamarine4", "gold1", "chartreuse1", "darkorange", "lightsalmon4")
    # c("black", "#DC4405", "#773F65", "#F2A900", "#244C5A", "#C60D58", "#687C04", "#0097A9", "#7A9A01", "#CD7F32")
  line_size = 1
  minHist <- min(realtime$min, na.rm=TRUE)
  maxHist <- max(realtime$max, na.rm=TRUE)
  minLines <- min(realtime$value, na.rm=TRUE)
  maxLines <- max(realtime$value, na.rm=TRUE)
  min <- if (minHist < minLines) minHist else minLines
  max <- if (maxHist > maxLines) maxHist else maxLines

  # x axis settings
  if (length(day_seq) > 60) {
    date_breaks = "1 month"
    labs = scales::label_date("%b %d", tz=Sys.timezone())
  } else if (length(day_seq) > 14) {
    date_breaks="1 week"
    labs = scales::label_date("%b %d", tz = Sys.timezone())
  } else if (length(day_seq) > 7) {
    date_breaks="2 days"
    labs=scales::label_date("%b %d", tz = Sys.timezone())
  } else if (length(day_seq) >= 2) {
    date_breaks="1 days"
    labs=scales::label_date("%b %d", tz = Sys.timezone())
  } else if (length(day_seq) > 1){
    date_breaks="24 hours"
    labs=scales::label_time("%H:%M", tz = Sys.timezone())
  } else if (length(day_seq) == 1) {
    date_breaks="12 hour"
    labs=scales::label_time(format="%H:%M", tz = Sys.timezone())
  }

  legend_length <- length(unique(realtime$plot_year))

  plot <- ggplot2::ggplot(realtime, ggplot2::aes(x = fake_datetime, y = value)) +
    ggplot2::ylim(min, max) +
    ggplot2::scale_x_datetime(date_breaks = date_breaks, labels = labs) +
    ggplot2::labs(x = "", y =  paste0((if (parameter != "SWE") stringr::str_to_title(parameter) else parameter), " (", units, ")")) +
    ggplot2::theme_classic() +
    ggplot2::theme(legend.position = "right", legend.justification = c(0, 0.95), legend.text = ggplot2::element_text(size = 8*plot_scale), legend.title = ggplot2::element_text(size = 9*plot_scale), axis.title.y = ggplot2::element_text(size = 12*plot_scale), axis.text.x = ggplot2::element_text(size = 9*plot_scale), axis.text.y = ggplot2::element_text(size = 9*plot_scale))

  if (!is.infinite(minHist)){
    plot <- plot +
      ggplot2::geom_ribbon(ggplot2::aes(ymin = min, ymax = max, fill = "Min - Max"), na.rm = T) +
      ggplot2::geom_ribbon(ggplot2::aes(ymin = QP25, ymax = QP75, fill = "25th-75th Percentile"), na.rm = T) +
      ggplot2::scale_fill_manual(name = "Historical Range", values = c("Min - Max" = "gray85", "25th-75th Percentile" = "gray65"))
  }

  plot <- plot +
    ggplot2::geom_line(ggplot2::aes(colour = as.factor(plot_year), group = as.factor(plot_year)), linewidth = line_size, na.rm = T) +
    ggplot2::scale_colour_manual(name = "Year", labels = rev(unique(realtime$plot_year)), values = colours[1:legend_length], na.translate = FALSE, breaks=rev(unique(realtime$plot_year)))


  # Get or calculate return periods -------------
  if (returns != "none"){
    if (returns %in% c("table", "auto")){
      #search for the location in the table
      returns_table <- paste0(parameter, "_returns_", return_type)
      if (location %in% data[[returns_table]]$ID){
        returns <- "table"
        loc_returns <- data[[returns_table]][data[[returns_table]]$ID == location , ]
        loc_returns[ , c("twoyear", "fiveyear", "tenyear", "twentyyear", "fiftyyear", "onehundredyear", "twohundredyear", "fivehundredyear", "thousandyear", "twothousandyear", "LSL", "FSL")] <- apply(loc_returns[ , c("twoyear", "fiveyear", "tenyear", "twentyyear", "fiftyyear", "onehundredyear", "twohundredyear", "fivehundredyear", "thousandyear", "twothousandyear", "LSL", "FSL")], 2, function(x) x + datum$conversion_m)
        loc_returns[is.na(loc_returns)==TRUE] <- -10 #This prevents a ggplot error when it tries to plot a logical along with numerics, but keeps the values out of the plot.

        plot <- plot +
          ggplot2::geom_hline(yintercept=loc_returns$twoyear, linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=loc_returns$fiveyear, linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=loc_returns$tenyear, linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=loc_returns$twentyyear, linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=loc_returns$fiftyyear, linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=loc_returns$onehundredyear, linetype="dashed", color="black") +
          ggplot2::geom_hline(yintercept=loc_returns$twohundredyear, linetype="dashed", color="black") +
          ggplot2::geom_hline(yintercept=loc_returns$fivehundredyear, linetype="dashed", color="black") +
          ggplot2::geom_hline(yintercept=loc_returns$thousandyear, linetype="dashed", color="black") +
          ggplot2::geom_hline(yintercept=loc_returns$twothousandyear, linetype="dashed", color="black") +
          ggplot2::annotate("text", x=mean(realtime$fake_datetime), y=c(loc_returns$twoyear, loc_returns$fiveyear, loc_returns$tenyear, loc_returns$twentyyear, loc_returns$fiftyyear, loc_returns$onehundredyear, loc_returns$twohundredyear, loc_returns$fivehundredyear, loc_returns$thousandyear, loc_returns$twothousandyear), label= c("two year return", "five year return", "ten year return", "twenty year return", "fifty year return", "one hundred year return", "two hundred year return", "five hundred year return", "one-thousand year return", "two-thousand year return"), size=2.6*plot_scale, vjust=-.2*plot_scale)
      } else if (returns == "auto"){ # if there is no entry to the table AND the user specified auto, calculate loop will run after this
        returns <- "calculate"
      }
    }
    if (returns == "calculate"){
      tryCatch({
        extremes <- suppressWarnings(fasstr::calc_annual_extremes(daily[daily$year <= return_max_year , ], dates = datetime_UTC, values = value, water_year_start = return_months[1], months = return_months, allowed_missing = allowed_missing))
        extremes$Measure <- "1-Day"
        if (return_type == "max"){
          analysis <- fasstr::compute_frequency_analysis(data = extremes, events = Year, values = "Max_1_Day", use_max = TRUE, fit_quantiles = c(0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0005))
          return_yrs <- c(min(lubridate::year(analysis$Freq_Analysis_Data$Max_1_Day_Date)), max(lubridate::year(analysis$Freq_Analysis_Data$Max_1_Day_Date)))
        } else if (return_type == "min"){
          analysis <- fasstr::compute_frequency_analysis(data = extremes, events = Year, values = "Min_1_Day", use_max = FALSE, fit_quantiles = c(0.5, 0.2, 0.1, 0.05, 0.02, 0.01, 0.005, 0.002, 0.001, 0.0005))
          return_yrs <- c(min(lubridate::year(analysis$Freq_Analysis_Data$Min_1_Day_Date)), max(lubridate::year(analysis$Freq_Analysis_Data$Min_1_Day_Date)))
        }
        freq <- analysis$Freq_Fitted_Quantiles

        plot <- plot +
          ggplot2::geom_hline(yintercept=as.numeric(freq[10,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[9,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[8,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[7,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[6,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[5,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[4,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[3,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[2,4]), linetype="dashed", color = "black") +
          ggplot2::geom_hline(yintercept=as.numeric(freq[1,4]), linetype="dashed", color = "black") +
          ggplot2::annotate("text", x=mean(realtime$fake_datetime), y=c(as.numeric(freq[10,4]), as.numeric(freq[9,4]), as.numeric(freq[8,4]), as.numeric(freq[7,4]), as.numeric(freq[6,4]), as.numeric(freq[5,4]), as.numeric(freq[4,4]), as.numeric(freq[3,4]), as.numeric(freq[2,4]), as.numeric(freq[1,4])), label= c("two year return", "five year return", "ten year return", "twenty year return", "fifty year return", "one hundred year return", "two hundred year return", "five hundred year return", "one-thousand year return", "two-thousand year return"), size=2.6*plot_scale, vjust=-.2*plot_scale)
      }, error = function(e){
        returns <<- "failed"
      })
    }
  }
  #Add some information below the legend
  spread <- max-min
  end_time <- max(realtime$fake_datetime)
  if (!is.infinite(minHist)){
    line1 <- paste0("\n         \n         \n        Historical range based\n        on years ", lubridate::year(min(daily$datetime_UTC)), " to ", lubridate::year(max(realtime$datetime_UTC))-1, "." )
  } else {
    line1 <- "\n         \n         \n        Not enough data for\n        historical ranges"
    plot <- plot + #Adjust the legend spacing so that the text isn't pushed off the plot area
      ggplot2::theme(legend.box.spacing = ggplot2::unit(40, "pt"))
  }
  if (returns == "calculate"){
    line2 <- paste0("        \n        \n        Return periods calculated\n        using months ", month.abb[return_months[1]], " to ",  month.abb[return_months[length(return_months)]], " \n        and years ", return_yrs[1], " to ", return_yrs[2], ". \n        ", nrow(analysis$Freq_Analysis_Data), " data points retained after\n        removing years with > ", allowed_missing, " %\n        missing data.")
    lines <- paste0(line1, line2)
    plot <- plot +
      ggplot2::coord_cartesian(clip="off", default=TRUE) +
      ggplot2::annotation_custom(grid::textGrob(lines, gp = grid::gpar(fontsize=8*plot_scale), just="left"), xmin=end_time, ymin = (max-spread/2)-8*spread/30, ymax =(max-spread/2)-8*spread/30)
  } else if (returns == "table"){
    line2 <- "        \n        \n        Return periods are based\n        on statistical analysis\n        of select data from the\n        start of records to 2021."
    lines <- paste0(line1, line2)
    plot <- plot +
      ggplot2::coord_cartesian(clip="off", default=TRUE) +
      ggplot2::annotation_custom(grid::textGrob(lines, gp = grid::gpar(fontsize=8*plot_scale), just="left"), xmin=end_time, ymin = (max-spread/2)-8*spread/30, ymax =(max-spread/2)-8*spread/30)
  } else if (returns == "failed") {
    line2 <- "        \n        \n        Insufficient data to \n        calculate returns using\n        last requested year."
    lines <- paste0(line1, line2)
    plot <- plot +
      ggplot2::coord_cartesian(clip="off", default=TRUE) +
      ggplot2::annotation_custom(grid::textGrob(lines, gp = grid::gpar(fontsize=8*plot_scale), just="left"), xmin=end_time, ymin = (max-spread/2)-8*spread/30, ymax =(max-spread/2)-8*spread/30)

  } else {
    plot <- plot +
      ggplot2::coord_cartesian(clip="off", default=TRUE) +
      ggplot2::annotation_custom(grid::textGrob(line1, gp = grid::gpar(fontsize=8*plot_scale), just = "left"), xmin=end_time, ymin = (max-spread/2)-7*spread/30, ymax=(max-spread/2)-7*spread/30)
  }

  # Wrap things up and return() -----------------------
  if (title == TRUE){
    stn_name <- DBI::dbGetQuery(con, paste0("SELECT name FROM locations where location = '", location, "'"))
    plot <- plot +
      ggplot2::labs(title=paste0("Location ", location, ": ", stn_name)) +
      ggplot2::theme(plot.title=ggplot2::element_text(hjust=0.05, size=14*plot_scale))
  }

  #Save it if requested
  if (!is.null(save_path)){
    ggplot2::ggsave(filename=paste0(save_path,"/", location, "_", parameter, "_", Sys.Date(), "_", lubridate::hour(as.POSIXct(format(Sys.time()), tz=tzone)), lubridate::minute(as.POSIXct(format(Sys.time()), tz=tzone)), ".png"), plot=plot, height=8, width=12, units="in", device="png", dpi=500)
  }

  return(plot)
}
YukonWRB/WRBplots documentation built on Sept. 29, 2023, 2:05 p.m.