R/fetch_refreshed_data.R

Defines functions fetch_refreshed_data

Documented in fetch_refreshed_data

#' Fetch a SurveyGizmo CSV data dump, making sure the returned data comes from the most recently refreshed file
#'
#' @description A SurveyGizmo data dump can be refreshed by appending &realtime=TRUE to its link. However,
#' the refresh is usually slow, and reading the file while it is being refreshed might return partial results.
#' This function therefore pauses (waits) between successive reading attempts.
#' It keeps re-reading the file for as long as its row count is still growing, in order to return the most recent version.
#' It should work as long as the waiting time is long enough and the file is no longer than 10k lines.
#' The 10k-line limit is a SurveyGizmo limitation (&realtime=TRUE only works for files under 10k lines).
#'
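#' @param link_str The link to the data dump. It is expected to already contain a query string, because "&realtime=TRUE" is appended to it.
#' @param wait The waiting time, in seconds, between successive reading attempts
#' @param ... Additional arguments passed on to \code{readr::read_csv()}
#'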
#' @return A tibble with the file's contents.
#'
#' @examples
#' \dontrun{
#' file_results <- fetch_refreshed_data("LINK_TO_FILE", wait = 5)
#' }
#'
#' @export
fetch_refreshed_data <- function(link_str, wait = 5, ...) {

  # Requesting the link with &realtime=TRUE triggers a refresh of the data dump;
  # the response itself is not used.
  realtime_link <- paste0(link_str, "&realtime=TRUE")
  httr::GET(realtime_link)

  # First reading, used as the baseline for comparison
  previous_reading <- suppressWarnings(suppressMessages(readr::read_csv(link_str, ...)))

  repeat {
    cat("*")

    # wait for the update before the next download
    Sys.sleep(wait)

    # next download:
    current_reading <- suppressWarnings(suppressMessages(readr::read_csv(link_str, ...)))

    # finish condition: the file did not grow since the previous reading
    if (NROW(current_reading) <= NROW(previous_reading)) {
      break
    }
    previous_reading <- current_reading
  }

  cat("\nLatest file pulled with ", NROW(current_reading), " rows.\n", sep = "")

  return(current_reading)

}
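
The polling idea described above (re-read until the row count stops growing) can be illustrated without any network access. The following is only a minimal sketch, not part of the package: `poll_until_stable()`, `make_fake_reader()`, and the `max_attempts` guard are hypothetical names introduced here for illustration, and the simulated reader stands in for the real `readr::read_csv()` call.

```r
# Hypothetical helper (not part of saridr): `read_fn` is any zero-argument
# function that returns a data frame.
poll_until_stable <- function(read_fn, wait = 0, max_attempts = 20) {
  previous <- read_fn()
  for (attempt in seq_len(max_attempts)) {
    Sys.sleep(wait)
    current <- read_fn()
    # Stop as soon as the row count did not grow between two readings
    if (NROW(current) <= NROW(previous)) {
      return(current)
    }
    previous <- current
  }
  # Assumed safety net: give up after max_attempts instead of looping forever
  warning("Row count was still growing after ", max_attempts, " attempts.")
  previous
}

# Simulated data source: each call returns the next size in `sizes`,
# then keeps returning the last one.
make_fake_reader <- function(sizes) {
  i <- 0
  function() {
    i <<- min(i + 1, length(sizes))
    data.frame(x = seq_len(sizes[i]))
  }
}

result <- poll_until_stable(make_fake_reader(c(3, 6, 9, 9)))
nrow(result)
```

With the simulated sizes 3, 6, 9, 9 the loop stops at the second 9-row reading. The `max_attempts` cap is a design choice of this sketch; the function above has no such limit and relies on the file eventually becoming stable.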
sarid-ins/saridr documentation built on Nov. 10, 2020, 9:07 p.m.