This is a package for interacting with BigQuery from within R.
You may instead want to use bigrquery, which is more developed and integrates with dplyr etc. Some functions from bigrquery are used in this package.
This package exists because it uses googleAuthR as a backend, so it has Shiny support and compatibility with other googleAuthR-dependent packages. It also supports data extracts to Google Cloud Storage, meaning you can download large results and make the download URL available to a user via their Google email. This matters because a normal query with over 100,000 results hangs and errors.
Authentication is as used in other googleAuthR libraries:
```r
library(bigQueryR)

## this will open your browser
## Authenticate with an email that has access to the BigQuery project you need
bqr_auth()

## verify under a new user
bqr_auth(email = "me@work.com")
```
If you are authenticating under several APIs via googleAuthR, then use gar_auth() instead, with the appropriate scopes set.
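For example, a minimal sketch of authenticating once for several APIs, where the two scopes below are illustrative placeholders standing in for whichever scopes your APIs actually need:

```r
library(googleAuthR)

## set the scopes for every API you will call before authenticating
## (the two scopes below are illustrative placeholders)
options(googleAuthR.scopes.selected = c(
  "https://www.googleapis.com/auth/bigquery",
  "https://www.googleapis.com/auth/devstorage.full_control"
))

## one browser-based authentication covering all configured scopes
gar_auth()
```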
You can also use service-to-service JSON files and multi-user authentication under Shiny; see the googleAuthR readme for details.
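As an illustration, a sketch of service-to-service authentication, assuming you have downloaded a service account JSON key from the Google Cloud console ("auth.json" is a placeholder filename):

```r
library(googleAuthR)

## authenticate non-interactively with a service account key file
## ("auth.json" is a placeholder for your downloaded key)
gar_auth_service("auth.json",
                 scope = "https://www.googleapis.com/auth/bigquery")
```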
The package includes various functions for listing what is in your BigQuery account:
```r
library(bigQueryR)

## this will open your browser
## Authenticate with an email that has access to the BigQuery project you need
bqr_auth()

## get projects
projects <- bqr_list_projects()

my_project <- projects$id[1]

## for the first project, get the datasets
datasets <- bqr_list_datasets(my_project)

my_dataset <- datasets$datasetId[1]

## list tables
my_tables <- bqr_list_tables(my_project, my_dataset)

## get metadata for a table
meta_table <- bqr_table_meta(my_project, my_dataset, my_tables$tableId[1])
```
You can run queries whose results are under ~100,000 rows using this command:
```r
bqr_query(my_project, "samples",
          "SELECT COUNT(repository.url) FROM [publicdata:samples.github_nested]")
```
Beyond that, the API starts to hang and you are limited by your download bandwidth.
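If you only need a slice of a large result, you can cap the rows returned. A minimal sketch, assuming the version of bqr_query() you have installed supports the maxResults argument:

```r
## cap the rows returned by a synchronous query
## (maxResults is assumed to be available in your installed version)
top_rows <- bqr_query(my_project, "samples",
                      "SELECT repository.url FROM [publicdata:samples.github_nested]",
                      maxResults = 1000)
```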
For bigger results, asynchronous queries save the results to another BigQuery table. You can check the progress of the job via bqr_get_job().
```r
library(bigQueryR)

## Auth with a project that has at least BigQuery and Google Cloud Storage scope
bqr_auth()

## make a big query
job <- bqr_query_asynch("your_project",
                        "your_dataset",
                        "SELECT * FROM blah LIMIT 9999999",
                        destinationTableId = "bigResultTable")

## poll the job to check its status
## it is done when job$status$state == "DONE"
bqr_get_job("your_project", job$jobReference$jobId)

## once done, the query results are in "bigResultTable"
```
You may now want to download this data. For large datasets, this is best done by extracting the BigQuery result to Google Cloud Storage, then downloading the data from there.
You can create a bucket in Google Cloud Storage via the web interface (see https://cloud.google.com/storage/docs/cloud-console), or via library(googleCloudStorageR).
Once the bucket is created, you can extract your data as below:
```r
## Create the data extract from BigQuery to Cloud Storage
job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "bigResultTable",
                                "your_cloud_storage_bucket_name")

## poll the extract job to check its status
## it is done when job$status$state == "DONE"
bqr_get_job("your_project", job_extract$jobReference$jobId)

## to download via a URL, without logging in to the Google Cloud Storage interface:
## Use an email that is Google-account enabled
## Requires scopes:
##   https://www.googleapis.com/auth/devstorage.full_control
##   https://www.googleapis.com/auth/cloud-platform
## set via options("bigQueryR.scopes") and reauthenticate if needed

download_url <- bqr_grant_extract_access(job_extract, "your@email.com")

## download_url may be multiple URLs if the data is > 1GB
> [1] "https://storage.cloud.google.com/big-query-r-extracts/extract-20160311112410-000000000000.csv"
> [2] "https://storage.cloud.google.com/big-query-r-extracts/extract-20160311112410-000000000001.csv"
> [3] "https://storage.cloud.google.com/big-query-r-extracts/extract-20160311112410-000000000002.csv"
```
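You can also fetch the extract files straight into R via googleCloudStorageR. A minimal sketch, assuming the bucket used above (the saved filename is a placeholder):

```r
library(googleCloudStorageR)

## list the objects in the bucket to find the extract file names
objects <- gcs_list_objects("your_cloud_storage_bucket_name")

## download the first extract file to disk
gcs_get_object(objects$name[1],
               bucket = "your_cloud_storage_bucket_name",
               saveToDisk = "extract-part-1.csv")

## read it into R
result <- read.csv("extract-part-1.csv")
```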