bqr_extract_data: Extract data asynchronously


View source: R/downloadData.R

Description

Use this instead of bqr_query for large result sets. It requires an existing Google Cloud Storage bucket to extract into; create one at https://console.cloud.google.com/storage/browser
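Alternatively, the bucket can be created programmatically with googleCloudStorageR. A minimal sketch, assuming you are authenticated with Cloud Storage scope (the bucket and project names are placeholders):

## hypothetical names; requires googleCloudStorageR and Cloud Storage scope
googleCloudStorageR::gcs_create_bucket("your_cloud_storage_bucket_name",
                                       projectId = "your_project")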

Usage

bqr_extract_data(projectId = bqr_get_global_project(),
  datasetId = bqr_get_global_dataset(), tableId, cloudStorageBucket,
  filename = paste0("big-query-extract-", gsub(" |:|-", "", Sys.time()),
  "-*.csv"), compression = c("NONE", "GZIP"),
  destinationFormat = c("CSV", "NEWLINE_DELIMITED_JSON", "AVRO"),
  fieldDelimiter = ",", printHeader = TRUE)

Arguments

projectId

The BigQuery project ID.

datasetId

A datasetId within projectId.

tableId

ID of the table you wish to extract.

cloudStorageBucket

URI of the bucket to extract into.

filename

Name of the file to extract to. Include a wildcard (*) if the extract is expected to be larger than 1 GB, so the output is split across multiple files.

compression

Compression applied to the extracted file: "NONE" (default) or "GZIP".

destinationFormat

Format of the extracted file: "CSV" (default), "NEWLINE_DELIMITED_JSON" or "AVRO".

fieldDelimiter

The field delimiter for the extracted file; applies to CSV exports.

printHeader

Whether to include a header row; applies to CSV exports.
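As a rough sketch, an extract that gzips the output and splits it across multiple CSV files could look like this (all names below are placeholders):

## hypothetical project, dataset, table and bucket names
job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "bigResultTable",
                                "your_cloud_storage_bucket_name",
                                filename = "big-extract-*.csv",
                                compression = "GZIP")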

Value

A Job object, to be queried via bqr_get_job.
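For example, a sketch of checking the returned job, assuming job_extract holds the Job object returned above (bqr_wait_for_job, from the same function family, is assumed here to accept the Job object and block until completion):

## poll once; the extract is finished when the state is "DONE"
job <- bqr_get_job("your_project", job_extract$jobReference$jobId)
job$status$state
## or wait for it to finish (assumes bqr_wait_for_job takes the Job object)
job <- bqr_wait_for_job(job_extract)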

See Also

https://cloud.google.com/bigquery/exporting-data-from-bigquery

Other BigQuery asynch query functions: bqr_download_extract, bqr_get_job, bqr_grant_extract_access, bqr_query_asynch, bqr_wait_for_job

Examples

## Not run: 
library(bigQueryR)

## Auth with a project that has at least BigQuery and Google Cloud Storage scope
bqr_auth()

## make a big query
job <- bqr_query_asynch("your_project", 
                        "your_dataset",
                        "SELECT * FROM blah LIMIT 9999999", 
                        destinationTableId = "bigResultTable")
                        
## poll the job to check its status
## it's done when job$status$state == "DONE"
bqr_get_job("your_project", job)

## once done, the query results are in "bigResultTable"
## extract that table to Google Cloud Storage:
## first create a bucket at
## https://console.cloud.google.com/storage/browser

job_extract <- bqr_extract_data("your_project",
                                "your_dataset",
                                "bigResultTable",
                                "your_cloud_storage_bucket_name")
                                
## poll the extract job to check its status
## it's done when job$status$state == "DONE"
bqr_get_job("your_project", job_extract$jobReference$jobId)

## You should also see the extract in the Google Cloud Storage bucket:
googleCloudStorageR::gcs_list_objects("your_cloud_storage_bucket_name")
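
## a sketch beyond the original example: download one extract file directly
## with googleCloudStorageR (the object name below is a placeholder)
googleCloudStorageR::gcs_get_object("big-query-extract-XXXXXX-000000000000.csv",
                                    bucket = "your_cloud_storage_bucket_name",
                                    saveToDisk = "extract.csv")
## see also bqr_download_extract for downloading via the extract job object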

## to download via a URL without logging in to the Google Cloud Storage interface:
## Use an email that is Google account enabled
## Requires scopes:
##  https://www.googleapis.com/auth/devstorage.full_control
##  https://www.googleapis.com/auth/cloud-platform

download_url <- bqr_grant_extract_access(job_extract, "your@email.com")

## download_url may be multiple if the data is > 1GB
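
## a sketch: if several URLs are returned, download each with base R
## (destination file names are illustrative)
for(i in seq_along(download_url)){
  download.file(download_url[i], destfile = paste0("extract-", i, ".csv"))
}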


## End(Not run)
