Creating your own API should be a matter of consulting the Google API documentation, and filling in the required details.
gar_api_generator() has these components:
- baseURI - all APIs have a base for every API call
- http_header - what type of request, most common are GET and POST
- path_args - some APIs need you to alter the URL folder structure when calling, e.g. /account/{accountId}/ where accountId is variable.
- pars_args - other APIs require you to send URL parameters e.g. ?account={accountId} where accountId is variable.
- data_parse_function - [optional] If the API call returns data, it will be available in $content. You can create a parsing function that transforms it into something you can work with (for instance, a dataframe)
Example below for generating a function:
f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url", "POST", data_parse_function = function(x) x$id)
The function generated uses path_args and pars_args to create a template, but when the function is called you will want to pass dynamic data to them. This is done via the path_arguments and pars_arguments parameters.
path_args, pars_args, path_arguments and pars_arguments all accept named lists.
If a name in path_args is present in path_arguments, then it is substituted in. This way you can pass dynamic parameters to the constructed function. Likewise for pars_args and pars_arguments.
## Create a function that requires a path argument /accounts/{accountId}
f <- gar_api_generator("https://www.googleapis.com/example",
                       "POST",
                       path_args = list(accounts = "defaultAccountId"),
                       data_parse_function = function(x) x$id)

## When using f(), pass the path_arguments parameter to it
## with the same name to modify "defaultAccountId":
result <- f(path_arguments = list(accounts = "myAccountId"))
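To make the substitution rule concrete, here is a minimal base-R sketch of the idea. The helpers substitute_args() and build_path() are hypothetical names invented for this illustration; they are not googleAuthR's internal code, and the name/value path layout is an assumption based on the /accounts/{accountId} example.

```r
# Hypothetical helper (not part of googleAuthR): replace template values
# with supplied values wherever the names match; ignore unknown names.
substitute_args <- function(template, supplied = list()) {
  matched <- supplied[names(supplied) %in% names(template)]
  utils::modifyList(template, matched)
}

# Hypothetical helper: turn a named list into a path such as accounts/myAccountId.
build_path <- function(args) {
  paste(names(args), unlist(args), sep = "/", collapse = "/")
}

template <- list(accounts = "defaultAccountId")

# No dynamic arguments supplied: the template default is kept.
path_default <- build_path(substitute_args(template))

# A matching name is supplied: its value replaces the default.
path_custom <- build_path(substitute_args(template, list(accounts = "myAccountId")))
```

The same matching-by-name logic would apply to URL parameters, only with a ?name=value layout instead of folders.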
A lot of Google APIs look for you to send data in the Body of the request. This is done after you construct the function.
googleAuthR uses httr's JSON parsing via jsonlite to construct JSON from R lists.
Construct your list, then use jsonlite::toJSON to check if it's in the correct format as specified by the Google documentation. This is often the hardest part of using the API.
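As a rough illustration of that check, the sketch below builds a nested R list and prints the JSON it produces. The body shown is a cut-down, illustrative version of a reporting request, and auto_unbox = TRUE is an assumption about how length-one vectors should be serialised; compare the printed JSON against the Google documentation for your own API.

```r
library(jsonlite)

# Named lists become JSON objects; unnamed lists become JSON arrays.
body <- list(
  reportRequests = list(
    list(
      viewId  = "ga:1212121",
      metrics = list(list(expression = "ga:sessions"))
    )
  )
)

# auto_unbox = TRUE writes length-one vectors as scalars rather than arrays
# (an assumption for this sketch; check what your API expects).
json <- toJSON(body, auto_unbox = TRUE)
print(json)
```

If a field comes out as ["value"] where the documentation shows "value" (or the reverse), adjust the nesting of the list rather than editing the JSON by hand.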
To aid debugging, set options(googleAuthR.verbose = 0) to see all the sent and received HTTP requests. What was sent as JSON in the body is also written to a file called request_debug.rds in the working directory.
Example:
library(googleAuthR)
library(googleAnalyticsR)
options(googleAuthR.verbose = 0)
ga_auth()
blah <- google_analytics_4(1212121,
                           date_range = c(Sys.Date() - 7, Sys.Date()),
                           metrics = "sessions")

Calling APIv4....
Single v4 batch
Token exists.
Valid local token
Request: https://analyticsreporting.googleapis.com/v4/reports:batchGet/
Body JSON parsed to: {"reportRequests":[{"viewId":"ga:121211","dateRanges":[{"startDate":"2017-01-06","endDate":"2017-01-13"}],"samplingLevel":"DEFAULT","metrics":[{"expression":"ga:sessions","alias":"sessions","formattingType":"METRIC_TYPE_UNSPECIFIED"}],"pageToken":"0","pageSize":1000,"includeEmptyRows":true}]}
-> POST /v4/reports:batchGet/ HTTP/1.1
-> Host: analyticsreporting.googleapis.com
-> User-Agent: googleAuthR/0.4.0.9000 (gzip)
-> Accept: application/json, text/xml, application/xml, */*
-> Content-Type: application/json
-> Accept-Encoding: gzip
-> Authorization: Bearer ya29XXXXX_EhpEot1ZPNP28MUmSz5EyQ7lY3kgNCFEefYv-Zof3a1RSwezgMJ5llCO44TA9iHi51c
-> Content-Length: 295
->
>> {"reportRequests":[{"viewId":"ga:1212121","dateRanges":[{"startDate":"2017-01-06","endDate":"2017-01-13"}],"samplingLevel":"DEFAULT","metrics":[{"expression":"ga:sessions","alias":"sessions","formattingType":"METRIC_TYPE_UNSPECIFIED"}],"pageToken":"0","pageSize":1000,"includeEmptyRows":true}]}

<- HTTP/1.1 200 OK
<- Content-Type: application/json; charset=UTF-8
<- Vary: Origin
<- Vary: X-Origin
<- Vary: Referer
<- Content-Encoding: gzip
<- Date: Fri, 13 Jan 2017 10:45:38 GMT
<- Server: ESF
<- Cache-Control: private
<- X-XSS-Protection: 1; mode=block
<- X-Frame-Options: SAMEORIGIN
<- X-Content-Type-Options: nosniff
<- Alt-Svc: quic=":443"; ma=2592000; v="35,34"
<- Transfer-Encoding: chunked
<-
Downloaded [1] rows from a total of [1].

> readRDS("request_debug.rds")
$url
[1] "https://analyticsreporting.googleapis.com/v4/reports:batchGet/"

$request_type
[1] "POST"

$body_json
{"reportRequests":[{"viewId":"ga:1212121","dateRanges":[{"startDate":"2017-01-06","endDate":"2017-01-13"}],"samplingLevel":"DEFAULT","metrics":[{"expression":"ga:sessions","alias":"sessions","formattingType":"METRIC_TYPE_UNSPECIFIED"}],"pageToken":"0","pageSize":1000,"includeEmptyRows":true}]}
Not all API calls return data, but if they do:
If you have no data_parse_function then the function returns the whole request object. The content is available in $content. You can then parse this yourself, or pass a function in to do it for you.
If you pass a function into data_parse_function, it works on the response's $content.
Example below of the differences between having a data parsing function and not:
## the body object that will be passed in
body = list(
  longUrl = "http://www.google.com"
)

## no data parsing function
f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                       "POST")

no_parse <- f(the_body = body)

## parsed data, only taking request$content$id
f2 <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                        "POST",
                        data_parse_function = function(x) x$id)

parsed <- f2(the_body = body)

## str(no_parse) has full details of API response.
## just looking at no_parse$content as this is what API returns
> str(no_parse$content)
List of 3
 $ kind   : chr "urlshortener#url"
 $ id     : chr "http://goo.gl/ZwT9pG"
 $ longUrl: chr "http://www.google.com/"

## compare to the above - equivalent to no_parse$content$id
> str(parsed)
 chr "http://goo.gl/mCYw2i"
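A parsing function can do more than pick out one field. The sketch below runs a hypothetical parser, parse_short_url(), against a mocked-up content list shaped like the str(no_parse$content) output shown above, so it can be tried without making an API call. The parser name and the data frame column names are assumptions made for this illustration.

```r
# Mocked-up content, shaped like the API response content shown earlier.
mock_content <- list(
  kind    = "urlshortener#url",
  id      = "http://goo.gl/ZwT9pG",
  longUrl = "http://www.google.com/"
)

# Hypothetical parser: keep the two useful fields as a one-row data frame.
parse_short_url <- function(x) {
  data.frame(short_url = x$id,
             long_url  = x$longUrl,
             stringsAsFactors = FALSE)
}

# Test the parser offline before passing it as data_parse_function.
parsed_df <- parse_short_url(mock_content)
```

Writing the parser as a named function like this makes it straightforward to unit test separately from the API call itself.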
The response is turned from JSON to a dataframe if possible, via jsonlite::fromJSON.
In some cases you may want to skip all parsing of API content, perhaps because it is not JSON or for some other reason.
For these cases, you can set options("googleAuthR.rawResponse" = TRUE) to skip all tests and return the raw response.
Here is an example of this from the googleCloudStorageR library:
gcs_get_object <- function(bucket, object_name){

  ## skip JSON parsing on output as we expect a CSV
  options(googleAuthR.rawResponse = TRUE)

  ## do the request
  ob <- googleAuthR::gar_api_generator("https://www.googleapis.com/storage/v1/",
                                       path_args = list(b = bucket,
                                                        o = object_name),
                                       pars_args = list(alt = "media"))
  req <- ob()

  ## set it back to FALSE for other API calls.
  options(googleAuthR.rawResponse = FALSE)
  req
}
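One thing to watch with this pattern is that if the request throws an error, the option is never set back. A possible variation, sketched here with base R's on.exit(), restores the previous value however the function exits. with_raw_response() and fake_request() are hypothetical names for this illustration, and fake_request() stands in for a real generated API function.

```r
# Start from a known state for the demonstration.
options(googleAuthR.rawResponse = FALSE)

# Hypothetical wrapper: switch the option on, run the request, and always
# restore the previous value, even if the request errors.
with_raw_response <- function(request_fun) {
  old <- options(googleAuthR.rawResponse = TRUE)
  on.exit(options(old), add = TRUE)
  request_fun()
}

# Stand-in for a real API call: reports the option value it sees.
fake_request <- function() getOption("googleAuthR.rawResponse")

during <- with_raw_response(fake_request)
after_ok <- getOption("googleAuthR.rawResponse")

# A failing request still leaves the option restored.
failed <- tryCatch(with_raw_response(function() stop("request failed")),
                   error = function(e) "error caught")
after_error <- getOption("googleAuthR.rawResponse")
```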
If you are doing many API calls, you can speed this up a lot by using the batch option.
This takes the API functions you have created and wraps them in the gar_batch function to request them all in one POST call. You then receive the responses in a list.
Note that this does not count as one call for API limit purposes; it just speeds up the processing.
The example below queries from two different APIs and returns them in a list: It lists websites in your Google Search Console, and shows your goo.gl link history.
## from search console API
list_websites <- function() {

  l <- gar_api_generator("https://www.googleapis.com/webmasters/v3/sites",
                         "GET",
                         data_parse_function = function(x) x$siteEntry)
  l()
}

## from goo.gl API
user_history <- function(){
  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url/history",
                         "GET",
                         data_parse_function = function(x) x$items)

  f()
}

googleAuthR::gar_auth(new_user = TRUE)

ggg <- gar_batch(list(list_websites(), user_history()))
A common batch task is to walk through the same API call, modifying only one parameter. An example includes walking through Google Analytics API calls by date to avoid sampling.
A function to enable this is implemented as gar_batch_walk, with an example below:
walkData <- function(ga, ga_pars, start, end){

  dates <- as.character(
    seq(as.Date(start, format="%Y-%m-%d"),
        as.Date(end, format="%Y-%m-%d"),
        by=1))

  ga_pars$samplingLevel <- "HIGHER_PRECISION"
  anyBatchSampled <- FALSE
  samplePercent   <- 0

  ## this is applied to each batch to keep tally of meta data
  bf <- function(batch_data){
    lapply(batch_data, function(the_data) {
      if(attr(the_data, 'containsSampledData')) anyBatchSampled <<- TRUE
      samplePercent <<- samplePercent + attr(the_data, "samplePercent")
    })
    batch_data
  }

  ## the walk through batch function.
  ## In this case both start-date and end-date are set to the date iteration
  ## if the output is parsed as a dataframe, it also includes a rbind function
  ## otherwise, it will return a list of lists
  walked_data <- googleAuthR::gar_batch_walk(ga,
                                             dates,
                                             gar_pars = ga_pars,
                                             pars_walk = c("start-date", "end-date"),
                                             batch_function = bf,
                                             data_frame_output = TRUE)

  message("Walked through all dates. Total Results: [", NROW(walked_data), "]")
  attr(walked_data, "dateRange") <- list(startDate = start, endDate = end)
  attr(walked_data, "totalResults") <- NROW(walked_data)
  attr(walked_data, "samplingLevel") <- "HIGHER_PRECISION, WALKED"
  attr(walked_data, "containsSampledData") <- anyBatchSampled
  attr(walked_data, "samplePercent") <- samplePercent / length(dates)

  walked_data
}
New in 0.4 are helper functions that use Google's API Discovery service.
This is a meta-API which holds all the necessary details to build any supported Google API, which covers all modern Google APIs. At the time of writing this is 152 libraries.
These libraries aren't intended to be submitted to CRAN or used straight away, but should take away a lot of documentation and function building work so you can concentrate on tests, examples and helper functions for your users.
Get a list of the current APIs via gar_discovery_apis_list()
all_apis <- gar_discovery_apis_list()
To get details of a particular API, use its name and version in the gar_discovery_api()
function:
a_api <- gar_discovery_api("urlshortener", "v1")
You can then pass this list to gar_create_package() along with a folder path to create all the files necessary for an R library. There are arguments to set it up with RStudio project files, do a CRAN CMD check and upload it to GitHub.
vision_api <- gar_discovery_api("vision", "v1")
gar_create_package(vision_api,
                   "/Users/mark/dev/R/autoGoogleAPI/",
                   rstudio = FALSE,
                   github = FALSE)
A loop to build all the Google libraries is shown below, the results of which are available in this GitHub repo.
library(googleAuthR)

api_df <- gar_discovery_apis_list()

api_json_list <- mapply(gar_discovery_api, api_df$name, api_df$version)

## WARNING: this takes a couple of hours
check_results <- lapply(api_json_list,
                        gar_create_package,
                        directory = "/Users/mark/dev/R/autoGoogleAPI",
                        github = FALSE)
From versions after 0.5.1 you can create caches of API calls that can be used within unit tests.
Set options(googleAuthR.mock_tests = TRUE) in your testing suite. Now API calls will be compared with a cache located within tests/mock/ and if they have a cached call with the same arguments no API call will actually be made. This allows you to perform tests when offline and without needing to deal with authentication issues on online services such as Travis.
If a cache is not available whilst options(googleAuthR.mock_tests = TRUE) is on, then a cache will be created. You can see which functions are cached using gar_mock_list() and delete the cache with gar_mock_delete().
You also need to specify which package you are testing with options(googleAuthR.mock_package = "mypackage") or using gar_setup_mock().
Within testing suites, put the flag at the start of the script:
library(testthat)
options(googleAuthR.mock_test = TRUE)
options(googleAuthR.cache_package = "mypackage")

## ...etc..

test_that("I can call this API", {

  call_api()

})
Run this once through with your own local authentication, then once all the caches are built you can upload to GitHub. Be sure you only upload caches you don't mind being public.
The cache is made at the point the API call is made, so any changes you make to parsing etc. will affect the results.
You can also set up local caching of API calls. Activate it by using the gar_cache_set_loc() function with either memory to store in RAM or the name of a folder to store .rds files, and specifying which function to cache calls from via gar_setup_cache("name_of_package").
Once set, if the function call name, arguments and body are the same, it will attempt to find a cache. If it exists it will read from there rather than making the API call. If it does not it will make the API call, and save the response to where you have specified in gar_cache_set_loc().
List the active caches using gar_cache_list() and remove them by either deleting the files in the cache directory or running gar_cache_delete().
This can be useful for large requests that may time out or sometimes return a 500 error. Re-running the API call will read from cache for the progress you have made so far, greatly speeding up reruns.
Obviously be careful to only use this where you know the API request won't change. If you want to refresh the cache you will need to turn caching off or delete the individual cache file in the directory; the hash name of the cache should be consistent.
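The general idea of keying a cache on the call's name and arguments can be sketched in base R as below. This is only an illustration of the technique, not googleAuthR's implementation: cached_call() and slow_api() are hypothetical, the cache lives in a plain environment, and the key is built with deparse() rather than a hash.

```r
# Hypothetical in-memory cache, keyed on function name plus arguments.
cache <- new.env()
n_calls <- 0

cached_call <- function(fun_name, fun, ...) {
  key <- paste(c(fun_name, deparse(list(...))), collapse = "|")
  if (exists(key, envir = cache, inherits = FALSE)) {
    # Same name and arguments seen before: skip the call entirely.
    return(get(key, envir = cache))
  }
  result <- fun(...)
  assign(key, result, envir = cache)
  result
}

# Stand-in for an expensive API call; counts how often it really runs.
slow_api <- function(x) {
  n_calls <<- n_calls + 1
  x * 2
}

first  <- cached_call("slow_api", slow_api, 21)  # runs slow_api
second <- cached_call("slow_api", slow_api, 21)  # served from the cache
third  <- cached_call("slow_api", slow_api, 5)   # new arguments, runs again
```

Because the key depends only on the name and arguments, a cached result is returned even if the underlying data has changed, which is why stale caches need to be cleared by hand.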
Below is an example building a link shortener R package using googleAuthR.
It was done referring to the documentation for Google URL shortener.
Note that the help docs specify the steps outlined above. These are in general the steps for every Google API. For this API, the base URL is https://www.googleapis.com/urlshortener/v1/url and the request type used to shorten a URL is POST.
library(googleAuthR)
## change the native googleAuthR scopes to the one needed.
options("googleAuthR.scopes.selected" =
          c("https://www.googleapis.com/auth/urlshortener"))

#' Shortens a url using goo.gl
#'
#' @param url URL to shorten with goo.gl
#'
#' @return a string of the short URL
shorten_url <- function(url){

  body = list(
    longUrl = url
  )

  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "POST",
                         data_parse_function = function(x) x$id)

  f(the_body = body)

}

#' Expands a url that has used goo.gl
#'
#' @param shortUrl Url that was shortened with goo.gl
#'
#' @return a string of the expanded URL
expand_url <- function(shortUrl){

  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "GET",
                         pars_args = list(shortUrl = "shortUrl"),
                         data_parse_function = function(x) x)

  f(pars_arguments = list(shortUrl = shortUrl))

}

#' Get analytics of a url that has used goo.gl
#'
#' @param shortUrl Url that was shortened with goo.gl
#' @param timespan The time period for the analytics data
#'
#' @return a dataframe of the goo.gl Url analytics
analytics_url <- function(shortUrl,
                          timespan = c("allTime", "month", "week", "day", "twoHours")){

  timespan <- match.arg(timespan)

  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url",
                         "GET",
                         pars_args = list(shortUrl = "shortUrl",
                                          projection = "FULL"),
                         data_parse_function = function(x) {
                           a <- x$analytics
                           return(a[timespan][[1]])
                         })

  f(pars_arguments = list(shortUrl = shortUrl))
}

#' Get the history of the authenticated user
#'
#' @return a dataframe of the goo.gl user's history
user_history <- function(){
  f <- gar_api_generator("https://www.googleapis.com/urlshortener/v1/url/history",
                         "GET",
                         data_parse_function = function(x) x$items)

  f()
}
To use the above functions:
library(googleAuthR)

# go through authentication flow
gar_auth()

s <- shorten_url("http://markedmondson.me")
s

expand_url(s)

analytics_url(s, timespan = "month")

user_history()