```r
library(async)
library(desc)
library(jsonlite)
```
Get the URLs of all US CRAN mirrors; we'll use these later.
```r
mirrors <- getCRANmirrors()
mirror_urls <- mirrors$URL[mirrors$CountryCode == "us"]
```
`response_time()` will be an async function that returns the total response time of a HEAD request to a single URL.
`http_stop_for_status()` is similar to `httr::stop_for_status()`: it throws an error if the status code is 400 or higher. Should this (or another) error happen, we just return `Inf` for that URL. We also add the URL to the result, as a name.
```r
response_time <- async(function(url) {
  http_head(url)$
    then(http_stop_for_status)$
    then(function(x) setNames(x[["times"]][["total"]], url))$
    catch(error = function(e) setNames(Inf, url))
})
```
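The error fallback can be seen in isolation with plain base R: a failed request maps to a named `Inf`, which always sorts last. A minimal sketch, with hypothetical URLs:

```r
# A successful request yields a named response time; a failed one yields Inf.
times <- c(
  setNames(0.2, "https://a.example/"),
  setNames(Inf, "https://b.example/")
)
sort(times)  # the failed URL sorts last
```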
Let's try it with a real CRAN URL, and with a test URL that returns an HTTP 404 status code.
```r
synchronise(response_time(mirror_urls[[1]]))
synchronise(response_time("https://httpbin.org/status/404"))
```
`fastest_urls()` will run `response_time()` on all supplied URLs, and stop as soon as 10 of them have responded. Then we sort the results, so the fastest response comes first.
```r
fastest_urls <- async(function(urls) {
  reqs <- lapply(urls, response_time)
  when_some(10, .list = reqs)$
    then(function(x) sort(unlist(x)))
})
```
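The sorting step can be sketched offline: `when_some()` delivers a list of named response times, and `sort(unlist(x))` flattens and orders them. Mock data with hypothetical URLs:

```r
# Mock of the kind of list when_some() would deliver.
x <- list(
  c("https://a.example/" = 0.42),
  c("https://b.example/" = 0.17)
)
sort(unlist(x))  # fastest URL comes first
```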
Let's run it on the real data.
```r
synchronise(fastest_urls(mirror_urls))
```
By resolving a package we mean

1. getting the `DESCRIPTION` file of the package, to be able to look up dependencies and other metadata, and
2. getting a download link, in case we want to download the package later.

For this example we'll implement two package types:

1. a package on GitHub, and
2. a package file at a URL.
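Whatever the package type, a resolution result is just a list with these two pieces of data. A minimal sketch of its shape, with made-up values (in the real resolvers below, `description` is a parsed `desc::description` object):

```r
# Hypothetical resolution result: a download link plus package metadata.
resolution <- list(
  url = "https://example.com/pkg_1.0.tar.gz",
  description = list(Package = "pkg", Version = "1.0")  # stand-in for a desc object
)
resolution$url
resolution$description$Package
```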
We start with GitHub packages, as these are more complicated.
First, we need to set the `Accept` HTTP header when we talk to the GitHub API, so we define a helper function for that. We also set a GitHub token to avoid GitHub rate limits; this should be in the `GITHUB_PAT` environment variable.
```r
default_gh_headers <- function() {
  headers <- c("Accept" = "application/vnd.github.v3+json")
  pat <- Sys.getenv("GITHUB_PAT", "")
  if (pat != "") {
    headers <- c(headers, Authorization = paste("token", pat))
  }
  headers
}
```
Getting the `DESCRIPTION` file is easy: we just need to hit the right API endpoint, and then extract the response. `desc::desc()` creates a nice `desc::description` object that can be queried.
We use `http_stop_for_status()` to error out if any error happens, e.g. the repo or the file does not exist, we cannot access it, etc. We don't catch this (or any other) error here, so `get_gh_description()` will throw an error to its `$then()` (or `$when_any()` or `$when_all()`) child, or to `synchronise()`.
```r
get_gh_description <- async(function(user, repo) {
  desc_url <- paste0(
    "https://raw.githubusercontent.com/", user, "/", repo, "/HEAD/DESCRIPTION")
  http_get(desc_url, headers = default_gh_headers())$
    then(http_stop_for_status)$
    then(function(resp) rawToChar(resp$content))$
    then(function(txt) desc::desc(text = txt))
})
```
Test:
```r
synchronise(get_gh_description("jeroen", "curl"))
```
Test non-existing repos, and other errors:
```r
synchronise(get_gh_description("jeroen", "curlqwerty"))
synchronise(get_gh_description("git", "git"))
```
The next thing we need is the SHA of the commit we want to download. For simplicity, we always use the last commit of the default branch here. Otherwise this function is very similar to the previous one.
```r
get_gh_sha <- async(function(user, repo) {
  commit_url <- paste0(
    "https://api.github.com/repos/", user, "/", repo, "/git/trees/HEAD")
  http_get(commit_url, headers = default_gh_headers())$
    then(http_stop_for_status)$
    then(function(resp) {
      cdata <- jsonlite::fromJSON(rawToChar(resp$content), simplifyVector = FALSE)
      cdata$sha
    })
})
```
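The JSON-extraction step can be tried offline on a minimal, hand-written tree response (the `sha` value here is made up):

```r
# Parse a minimal fake git tree response and keep only the sha field.
body <- '{"sha": "abc123", "tree": []}'
cdata <- jsonlite::fromJSON(body, simplifyVector = FALSE)
cdata$sha
```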
Test:
```r
synchronise(get_gh_sha("jeroen", "curl"))
```
A small helper function to build a GitHub download URL from the user name, repo name and SHA hash:
```r
get_gh_download_url <- function(user, repo, sha) {
  paste0("https://api.github.com/repos/", user, "/", repo, "/zipball/", sha)
}
```
We are ready to write the GitHub resolver now. It takes a slug, i.e. the `user/repo` form, and then queries both the `DESCRIPTION` file and the SHA hash. `when_all()` resolves once both of them are done, and returns a named list of both pieces of data. Then we just create the download URL, and return it together with the already parsed `DESCRIPTION`.
```r
resolve_gh <- async(function(slug) {
  slug <- strsplit(slug, "/")[[1]]
  user <- slug[[1]]
  repo <- slug[[2]]
  desc <- get_gh_description(user, repo)
  sha <- get_gh_sha(user, repo)
  when_all(desc = desc, sha = sha)$
    then(function(x) {
      list(
        url = get_gh_download_url(user, repo, x$sha),
        description = x$desc)
    })
})
```
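The slug parsing at the top of the resolver is plain `strsplit()`; in isolation:

```r
# Split a "user/repo" slug into its two components.
slug <- strsplit("jeroen/curl", "/")[[1]]
user <- slug[[1]]  # "jeroen"
repo <- slug[[2]]  # "curl"
```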
Test:
```r
synchronise(resolve_gh("jeroen/curl"))
```
Resolving a package at an HTTP URL is much simpler: it needs a single HTTP request, because we simply need to download the package to get the `DESCRIPTION` file. We place the downloaded package in a temporary file (with the same file name), and return a `file://` URL to this file.
```r
resolve_url <- async(function(url) {
  dir.create(tmpdir <- tempfile())
  dest <- file.path(tmpdir, basename(url))
  http_get(url, file = dest)$
    then(http_stop_for_status)$
    then(function() {
      dest <- normalizePath(dest)
      list(
        # paste0(), not paste(): paste() would insert a space after "file://"
        url = paste0("file://", dest),
        description = desc::desc(dest))
    })
})
```
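One subtlety when building the `file://` URL: `paste()` inserts a space between its arguments, so `paste0()` is the one to use. A quick check with a hypothetical path:

```r
dest <- "/tmp/curl_2.0.tar.gz"  # hypothetical download location
paste("file://", dest)          # broken: contains a space after the scheme
paste0("file://", dest)         # correct: "file:///tmp/curl_2.0.tar.gz"
```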
Test:
```r
curl20_url <- "https://cloud.r-project.org/src/contrib/Archive/curl/curl_2.0.tar.gz"
synchronise(resolve_url(curl20_url))
```
Now combine the two functions:
```r
res <- synchronise(when_all(
  resolve_gh("jeroen/curl"),
  resolve_gh("ropensci/magick"),
  resolve_url(curl20_url)
))
length(res)
```
Errors bubble up to the top level, other downloads are cancelled automatically (unless they have already completed, of course), and an error is thrown from `synchronise()`.
```r
tryCatch(
  res <- synchronise(when_all(
    resolve_gh("jeroen/curl"),
    resolve_gh("ropensci/magick"),
    resolve_url(curl20_url),
    resolve_url("https://httpbin.org/delay/2")
  )),
  error = function(e) e
)
```
But we can also handle errors locally, at any point we wish. E.g. we can simply return `NULL` for the failed packages.
```r
res <- synchronise(when_all(
  resolve_gh("jeroen/curl"),
  resolve_gh("ropensci/magickfoooooobar")$catch(error = function(e) NULL),
  resolve_url(curl20_url),
  resolve_url("https://httpbin.org/status/401")$catch(error = function(e) NULL)
))
res[[1]]$description$get("Package")
res[[2]]
res[[3]]$description$get("Version")
res[[4]]
```