
ghr

Lifecycle: experimental

The files you need are often scattered across multiple GitHub organizations and repositories. How can you easily find and access them from R?

The ghr (GitHub-R) package helps you to explore and access GitHub directories from R, using a familiar syntax and interface.

Installation

install.packages("devtools")
devtools::install_github("maurolepore/ghr")

Example

Overview

library(purrr)
library(ghr)

# Familiar syntax, similar to the `repo` argument of `remotes::install_github()`
path <- "maurolepore/tor/inst/extdata/mixed@master"

# Familiar interface, similar to `fs::dir_ls()`
ghr_ls(path)
ghr_ls(path, regexp = "[.]csv$", invert = TRUE)

# Easily read data directly from GitHub into R
path %>% 
  ghr_ls_download_url(regexp = "[.]csv$") %>% 
  readr::read_csv()

Details

Use ghr_get() to get a GitHub-API response. Notice that the call is memoised.

system.time(ghr_get("maurolepore/ghr"))
# The second call takes almost no time because the result of the first call is memoised
system.time(ghr_get("maurolepore/ghr"))

response <- ghr_get(path = "maurolepore/ghr")
class(response)
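To make the idea of memoisation concrete, here is a minimal base-R sketch. It does not show how ghr implements caching internally; `slow_request()` and `memoise_simple()` are hypothetical helpers written only for this illustration.

```r
# Hypothetical stand-in for a slow request; not part of ghr.
slow_request <- function(path) {
  Sys.sleep(0.2)
  paste("response for", path)
}

# Wrap a one-argument function so that results are cached by argument value.
memoise_simple <- function(f) {
  force(f)
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- as.character(x)
    if (!exists(key, envir = cache, inherits = FALSE)) {
      # First call with this argument: compute and store the result.
      assign(key, f(x), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

cached_request <- memoise_simple(slow_request)

# The first call pays the cost; the second call is served from the cache.
first <- system.time(cached_request("maurolepore/ghr"))[["elapsed"]]
second <- system.time(cached_request("maurolepore/ghr"))[["elapsed"]]
second < first
```

A cache like this trades freshness for speed: repeated calls in the same session return the stored result rather than asking again.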

Use ghr_show_fields() to see what fields are available for a given path.

ghr_show_fields(response)

Use ghr_pull() to access specific fields of the GitHub response.

ghr_pull(response, "name")
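Conceptually, pulling a field means extracting the same named element from every item of the response. The sketch below mimics that with base R on a made-up list; `fake_response` and `pull_field()` are hypothetical and only approximate the shape of a real GitHub-API response.

```r
# Hypothetical, simplified response: one list per item, each with named fields.
fake_response <- list(
  list(name = "DESCRIPTION", type = "file"),
  list(name = "R", type = "dir")
)

# Fields available on each item (all items are assumed to share the same fields).
names(fake_response[[1]])

# Extract one field from every item as a character vector.
pull_field <- function(response, field) {
  vapply(response, function(item) item[[field]], character(1))
}

pull_field(fake_response, "name")
pull_field(fake_response, "type")
```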

ghr_ls(), ghr_ls_download_url() and ghr_ls_html_url() are shortcuts for ghr_pull(ghr_get(path), field = "<field>"). They offer an interface similar to fs::dir_ls().

path <- "maurolepore/tor/inst/extdata/mixed"
ghr_ls(path, regexp = "[.]csv$")
ghr_ls(path, regexp = "[.]csv$", invert = TRUE)

ghr_ls(path, regexp = "[.]RDATA$", ignore.case = FALSE)
ghr_ls(path, regexp = "[.]RDATA$", ignore.case = TRUE)
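The effect of regexp, invert and ignore.case can be previewed offline with base R's grep(). This is a sketch under assumptions: `filter_names()` is a hypothetical helper, the file names are invented, and ghr's actual filtering code may differ.

```r
# Hypothetical helper that filters names the way the arguments above suggest.
filter_names <- function(x, regexp = NULL, invert = FALSE, ignore.case = FALSE) {
  if (is.null(regexp)) {
    return(x)
  }
  grep(regexp, x, value = TRUE, invert = invert, ignore.case = ignore.case)
}

# Invented file names, for illustration only.
files <- c("csv1.csv", "rda.rda", "lower_rdata.rdata", "upper_rdata.RData")

# Keep only names ending in .csv
filter_names(files, regexp = "[.]csv$")
# Keep everything except names ending in .csv
filter_names(files, regexp = "[.]csv$", invert = TRUE)

# Case-sensitive matching finds nothing; case-insensitive matching finds both
filter_names(files, regexp = "[.]RDATA$", ignore.case = FALSE)
filter_names(files, regexp = "[.]RDATA$", ignore.case = TRUE)
```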

ghr_ls_download_url() and ghr_ls_html_url() are most useful in combination with purrr::map(), utils::browseURL() and reader functions such as read.csv().

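# Read each file in the directory into a list of data frames
"maurolepore/tor/inst/extdata/csv" %>%
  ghr_ls_download_url() %>%
  print() %>%
  map(~ read.csv(.x, stringsAsFactors = FALSE))

# Collect the URL of each file's GitHub page
html_urls <- "maurolepore/tor/inst/extdata/csv" %>%
  ghr_ls_html_url()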

if (interactive()) {
  html_urls %>% 
    map(~ browseURL(.x))
}
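The same read-many-files pattern can be tried without network access. The sketch below writes two small CSV files to a temporary directory and reads them back with lapply(), the base-R counterpart of purrr::map(); the file names and contents are invented for illustration.

```r
# Create a temporary directory with two small, made-up CSV files.
demo_dir <- tempfile("csv-demo-")
dir.create(demo_dir)
write.csv(data.frame(x = 1:2), file.path(demo_dir, "csv1.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(demo_dir, "csv2.csv"), row.names = FALSE)

# List the files, then read each one into its own data frame.
paths <- list.files(demo_dir, pattern = "[.]csv$", full.names = TRUE)
dfs <- lapply(paths, read.csv, stringsAsFactors = FALSE)
names(dfs) <- basename(paths)

# Optionally stack the data frames into one.
combined <- do.call(rbind, dfs)
nrow(combined)
```

Naming the list by file keeps track of where each data frame came from, which helps when the files do not share the same columns and cannot be stacked.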

You can pass additional arguments to gh::gh() via ..., for example .limit, if you need more items than the 30 that a single page returns by default.

# Default number of items per page is 30
length(ghr_ls("maurolepore"))
# All repos of the user maurolepore
length(ghr_ls("maurolepore", .limit = Inf))

You can request information from a specific branch via the argument ref, or with a path structured like this: owner/repo/subdir@branch.

ghr_show_branches("maurolepore/ghr")

ghr_ls("maurolepore/ghr", ref = "gh-pages")
# Same
ghr_ls("maurolepore/ghr@gh-pages")
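To show how a path of the form owner/repo/subdir@branch breaks down into its parts, here is a small base-R sketch. `parse_path()` is a hypothetical helper written for this illustration, not ghr's own parser, and it assumes the branch name contains no "@".

```r
# Hypothetical parser for paths shaped like "owner/repo/subdir@branch".
parse_path <- function(path) {
  # Split off the optional "@branch" suffix.
  parts <- strsplit(path, "@", fixed = TRUE)[[1]]
  ref <- if (length(parts) > 1) parts[[2]] else NA_character_

  # Split the remainder into owner, repo and an optional subdirectory.
  pieces <- strsplit(parts[[1]], "/", fixed = TRUE)[[1]]
  list(
    owner = pieces[[1]],
    repo = if (length(pieces) >= 2) pieces[[2]] else NA_character_,
    subdir = if (length(pieces) > 2) {
      paste(pieces[-(1:2)], collapse = "/")
    } else {
      NA_character_
    },
    ref = ref
  )
}

str(parse_path("maurolepore/tor/inst/extdata/mixed@master"))
str(parse_path("maurolepore/ghr@gh-pages"))
str(parse_path("maurolepore"))
```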

Information

Acknowledgments

Thanks to Gábor Csárdi et al. for the gh package and to Francois Michonneau, James Balamuta, Noam Ross and Bryce Mecum for sharing their ideas and code.



maurolepore/ghr documentation built on May 18, 2019, 12:26 p.m.