R/source_GitHubData.R

Defines functions source_GitHubData

#' Load plain-text data from GitHub
#'
#' \code{source_GitHubData} loads plain-text formatted data stored on GitHub (and other secure HTTPS websites) into R. NOTE: this command is deprecated. Use \code{\link{source_data}} instead.
#' @param url The plain-text formatted data's RAW URL.
#' @param sha1 Character string of the file's SHA-1 hash, generated by \code{source_data}.
#' @param sep The separator method for the data. By default \code{sep = ","} to load comma-separated values data (CSV). To load tab-separated values data (TSV) use \code{sep = "\t"}.
#' @param header Whether or not the first line of the file is the header (i.e. variable names). The default is \code{header = TRUE}.
#' @param ... Additional arguments passed to \code{\link{read.table}}.
#' @return A data frame.
#' @details Loads plain-text data (e.g. CSV, TSV) from GitHub into R.
#' The function is basically the same as \code{\link{source_data}}, but with defaults chosen to make loading CSV files easier.
#' Note: the GitHub URL you give for the \code{url} argument must be for the RAW version of the file. The function should also be able to download plain-text data from any secure (https) URL, though I have not verified this.
#'
#' From the source_url documentation: "If a SHA-1 hash is specified with the sha1 argument, then this function will check the SHA-1 hash of the downloaded file to make sure it matches the expected value, and throw an error if it does not match. If the SHA-1 hash is not specified, it will print a message displaying the hash of the downloaded file. The purpose of this is to improve security when running remotely-hosted code; if you have a hash of the file, you can be sure that it has not changed."
#' @examples
#' # Download electoral disproportionality data stored on GitHub
#' # Note: Using shortened URL created by bitly
#' \dontrun{
#' DisData <- source_GitHubData("http://bit.ly/Ss6zDO")
#' }
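#' # A minimal offline sketch of the SHA-1 check described in the details.
#' # Assumption for illustration: a local temporary file stands in for the
#' # downloaded file, and the digest package is installed.
#' \dontrun{
#' tmp <- tempfile()
#' writeLines(c("x,y", "1,2"), tmp)
#' # Hash that would be recorded after the first download
#' expected <- digest::digest(file = tmp, algo = "sha1")
#' # An unchanged file reproduces the same hash, so the check passes
#' stopifnot(identical(digest::digest(file = tmp, algo = "sha1"), expected))
#' unlink(tmp)
#' }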
#' @source Based on \code{source_url} from Hadley Wickham's devtools package.
#' @seealso \link{httr} and \code{\link{read.table}}
#' @importFrom digest digest
#' @importFrom httr GET stop_for_status content
#' @importFrom utils read.table
#' @keywords deprecated
#' @noRd

source_GitHubData <- function(url, sha1 = NULL, sep = ",", header = TRUE, ...)
{
    warning('source_GitHubData is deprecated. Use source_data instead.')
    stopifnot(is.character(url), length(url) == 1)

    temp_file <- tempfile()
    on.exit(unlink(temp_file))

    request <- GET(url)
    stop_for_status(request)
    writeBin(content(request, type = "raw"), temp_file)

    file_sha1 <- digest(file = temp_file, algo = "sha1")

    if (is.null(sha1)) {
        message("SHA-1 hash of file is ", file_sha1)
    }
    else {
        if (!identical(file_sha1, sha1)) {
            stop("SHA-1 hash of downloaded file (", file_sha1,
                ")\n  does not match expected value (", sha1,
                ")", call. = FALSE)
        }
    }

    read.table(temp_file, sep = sep, header = header, ...)
}


repmis documentation built on May 2, 2019, 12:48 a.m.