bigrquery


The bigrquery package makes it easy to work with data stored in Google BigQuery by allowing you to query BigQuery tables and retrieve metadata about your projects, datasets, tables, and jobs. The bigrquery package provides three levels of abstraction on top of BigQuery: a low-level API, a DBI interface, and a dplyr interface. Each is demonstrated in the Usage section.

Installation

The current bigrquery release can be installed from CRAN:

install.packages("bigrquery")

The newest development release can be installed from GitHub:

# install.packages("pak")
pak::pak("r-dbi/bigrquery")

Usage

Low-level API

library(bigrquery)
billing <- bq_test_project() # replace this with your project ID 
sql <- "SELECT year, month, day, weight_pounds FROM `publicdata.samples.natality`"

tb <- bq_project_query(billing, sql)
bq_table_download(tb, n_max = 10)
#> # A tibble: 10 × 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969    10     7          7.56
#>  2  1969     5     9          6.62
#>  3  1969     2     6          2.00
#>  4  1969     1     8          8.44
#>  5  1969     6    23          9.81
#>  6  1969     7    31          7.19
#>  7  1969    11     6          7.50
#>  8  1969    12    19          7.50
#>  9  1969     2    17          7.05
#> 10  1969     5     3          8.50
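When a query depends on a value held in R, passing it as a query parameter is usually safer than pasting it into the SQL string. The following is a minimal sketch, assuming that bq_dataset_query() forwards a `parameters` list to BigQuery and matches each name to an `@name` placeholder, as the package documentation describes. The helper name births_in_state() and the example state are made up for illustration.

```r
library(bigrquery)

# Hypothetical helper: count births recorded for one state in the public
# natality table. `billing` is the ID of the project that pays for the query.
births_in_state <- function(billing, state) {
  # Setting a default dataset lets the query use the bare table name.
  ds <- bq_dataset("publicdata", "samples")
  tb <- bq_dataset_query(
    ds,
    query = "SELECT COUNT(*) AS n FROM natality WHERE state = @state",
    parameters = list(state = state),  # matched to @state in the SQL
    billing = billing
  )
  bq_table_download(tb)
}

# Example call (requires authentication and a billing project):
# births_in_state(bq_test_project(), "KS")
```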

DBI

library(DBI)

con <- dbConnect(
  bigrquery::bigquery(),
  project = "publicdata",
  dataset = "samples",
  billing = billing
)
con 
#> <BigQueryConnection>
#>   Dataset: publicdata.samples
#>   Billing: gargle-169921

dbListTables(con)
#> [1] "github_nested"   "github_timeline" "gsod"            "natality"       
#> [5] "shakespeare"     "trigrams"        "wikipedia"

dbGetQuery(con, sql, n = 10)
#> # A tibble: 10 × 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  1969    10     7          7.56
#>  2  1969     5     9          6.62
#>  3  1969     2     6          2.00
#>  4  1969     1     8          8.44
#>  5  1969     6    23          9.81
#>  6  1969     7    31          7.19
#>  7  1969    11     6          7.50
#>  8  1969    12    19          7.50
#>  9  1969     2    17          7.05
#> 10  1969     5     3          8.50
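The same query can also be run in separate steps with the standard DBI generics, which keeps submitting the query apart from fetching rows. This is a sketch under the assumption that bigrquery implements dbSendQuery(), dbFetch(), dbClearResult(), and dbDisconnect() in the usual DBI way. first_rows() is a hypothetical helper, and `billing` stands for your own project ID.

```r
library(DBI)

# Hypothetical helper: run `sql` against the public samples dataset and
# return at most `n` rows.
first_rows <- function(billing, sql, n = 10) {
  con <- dbConnect(
    bigrquery::bigquery(),
    project = "publicdata",
    dataset = "samples",
    billing = billing
  )
  # Close the connection when the function exits, even on error.
  on.exit(dbDisconnect(con), add = TRUE)

  res <- dbSendQuery(con, sql)   # submit the query
  rows <- dbFetch(res, n = n)    # fetch up to n rows
  dbClearResult(res)             # release the result set
  rows
}

# Example call (requires authentication and a billing project):
# first_rows(billing, sql)
```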

dplyr

library(dplyr)

natality <- tbl(con, "natality")
#> Warning: <BigQueryConnection> uses an old dbplyr interface
#> ℹ Please install a newer version of the package or contact the maintainer
#> This warning is displayed once every 8 hours.

natality %>%
  select(year, month, day, weight_pounds) %>% 
  head(10) %>%
  collect()
#> # A tibble: 10 × 4
#>     year month   day weight_pounds
#>    <int> <int> <int>         <dbl>
#>  1  2005     5    NA          7.56
#>  2  2005     6    NA          4.75
#>  3  2005    11    NA          7.37
#>  4  2005     6    NA          7.81
#>  5  2005     5    NA          3.69
#>  6  2005    10    NA          6.95
#>  7  2005    12    NA          8.44
#>  8  2005    10    NA          8.69
#>  9  2005    10    NA          7.63
#> 10  2005     7    NA          8.27
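Because the dplyr interface goes through dbplyr, a pipeline is translated to SQL and only returns data to R when you call collect(). The sketch below illustrates this with a grouped summary; mean_weight_by_year() is a hypothetical helper, and `con` is assumed to be a connection to the publicdata.samples dataset like the one created in the DBI example.

```r
library(dplyr)

# Hypothetical helper: build (but do not collect) a lazy query that computes
# the mean birth weight per year.
mean_weight_by_year <- function(con) {
  tbl(con, "natality") %>%
    group_by(year) %>%
    summarise(mean_weight = mean(weight_pounds, na.rm = TRUE)) %>%
    arrange(year)
}

# Example use (requires authentication and a billing project):
# query <- mean_weight_by_year(con)
# show_query(query)  # prints the generated SQL
# collect(query)     # runs the query and returns a tibble
```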

Important details

Authentication and authorization

When using bigrquery interactively, you’ll be prompted to authorize bigrquery in the browser. You’ll be asked if you want to cache tokens for reuse in future sessions. For non-interactive use, prefer a service account token where possible. The help page for bq_auth() describes the available options.

Note that bigrquery requests permission to modify your data, but it will never do so unless you explicitly request it (e.g. by calling bq_table_delete() or bq_table_upload()). Our Privacy policy provides more info.
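For non-interactive use, one possible setup is to keep the path to a service account key file in an environment variable and pass it to bq_auth(). This is only a sketch: the variable name BIGQUERY_KEY_FILE and the helper auth_with_service_account() are assumptions made for illustration, not part of bigrquery.

```r
# Hypothetical helper: authenticate with a service account key file whose
# path is stored in an environment variable. Returns TRUE if bq_auth() was
# called and FALSE if no usable key file was found.
auth_with_service_account <- function(env_var = "BIGQUERY_KEY_FILE") {
  key_file <- Sys.getenv(env_var, unset = "")
  if (!nzchar(key_file) || !file.exists(key_file)) {
    return(FALSE)  # no key file available; the caller decides what to do
  }
  bigrquery::bq_auth(path = key_file)
  TRUE
}

# Example use in a script:
# if (!auth_with_service_account()) stop("No service account key file found")
```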

Billing project

If you just want to play around with the BigQuery API, it’s easiest to start with Google’s free sample data. You’ll still need to create a project, but if you’re just playing around, it’s unlikely that you’ll go over the free limit (1 TB of queries / 10 GB of storage).

To create a project:

  1. Open https://console.cloud.google.com/ and create a project. Make a note of the “Project ID” in the “Project info” box.

  2. Click on “APIs & Services”, then “Dashboard” in the left menu.

  3. Click on “Enable APIs and Services” at the top of the page, then search for “BigQuery API” and “Cloud Storage”.

Use your project ID as the billing project whenever you work with free sample data, and as the project when you work with your own data.
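The difference between the two roles can be sketched with a pair of DBI connections. The helper names connect_sample_data() and connect_own_data() are hypothetical, and the project and dataset IDs in the example calls are placeholders to replace with your own.

```r
library(DBI)

# Sample data: the tables live in Google's "publicdata" project, but the
# queries are billed to your own project.
connect_sample_data <- function(billing) {
  dbConnect(
    bigrquery::bigquery(),
    project = "publicdata",
    dataset = "samples",
    billing = billing
  )
}

# Your own data: your project both holds the data and pays for the queries.
connect_own_data <- function(project, dataset) {
  dbConnect(
    bigrquery::bigquery(),
    project = project,
    dataset = dataset,
    billing = project
  )
}

# Example calls (placeholders, require authentication):
# con_samples <- connect_sample_data("my-project-id")
# con_mine <- connect_own_data("my-project-id", "my_dataset")
```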

Useful links

Policies

Please note that the ‘bigrquery’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Privacy policy




bigrquery documentation built on April 20, 2023, 5:14 p.m.