View source: R/spark_read_bigquery.R
This function reads data stored in a Google BigQuery table.
Usage

spark_read_bigquery(sc, name,
  billingProjectId = default_billing_project_id(),
  projectId = billingProjectId, datasetId = NULL, tableId = NULL,
  sqlQuery = NULL, type = default_bigquery_type(),
  gcsBucket = default_gcs_bucket(),
  serviceAccountKeyFile = default_service_account_key_file(),
  additionalParameters = NULL, memory = FALSE, ...)
Arguments

sc
  A spark_connection.

name
  The name to assign to the newly generated table.

billingProjectId
  Google Cloud Platform project ID for billing purposes. This is the
  project on whose behalf to perform BigQuery operations. Defaults to
  default_billing_project_id().

projectId
  Google Cloud Platform project ID of the BigQuery dataset. Defaults to
  billingProjectId.

datasetId
  Google BigQuery dataset ID (may contain letters, numbers and
  underscores). Either both of datasetId and tableId or sqlQuery must
  be specified.

tableId
  Google BigQuery table ID (may contain letters, numbers and
  underscores). Either both of datasetId and tableId or sqlQuery must
  be specified.

sqlQuery
  Google BigQuery SQL query. Either both of datasetId and tableId or
  sqlQuery must be specified.

type
  BigQuery import type to use. Options include "direct", "avro",
  "json" and "csv". Defaults to default_bigquery_type().

gcsBucket
  Google Cloud Storage (GCS) bucket to use for storing temporary files.
  Temporary files are used when importing through BigQuery load jobs
  and exporting through BigQuery extraction jobs (i.e. when using data
  extracts such as Parquet, Avro, ORC, ...). The service account
  specified in serviceAccountKeyFile needs appropriate permissions on
  this bucket.

serviceAccountKeyFile
  Google Cloud service account key file to use for authentication with
  Google Cloud services. The use of service accounts is highly
  recommended. Specifically, the service account will be used to
  interact with BigQuery and Google Cloud Storage (GCS).

additionalParameters
  Additional spark-bigquery options. See
  https://github.com/miraisolutions/spark-bigquery for more
  information.

memory
  Logical; whether the table should be loaded eagerly into memory,
  i.e. cached.

...
  Additional arguments passed to spark_read_source.
Value

A tbl_spark, which provides a dplyr-compatible reference to a Spark
DataFrame.
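Because the return value is a dplyr-compatible tbl_spark, it can be manipulated lazily with standard dplyr verbs before materializing results in R. A minimal sketch, assuming the shakespeare table from the Examples section has already been read (word, word_count and corpus are columns of that public sample table):

```r
library(dplyr)

# Hypothetical follow-up to the shakespeare example: the grouping and
# aggregation are pushed down to Spark; collect() brings the result into R.
shakespeare %>%
  group_by(corpus) %>%
  summarise(total_words = sum(word_count, na.rm = TRUE)) %>%
  arrange(desc(total_words)) %>%
  collect()
```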
References

https://github.com/miraisolutions/spark-bigquery
https://cloud.google.com/bigquery/docs/datasets
https://cloud.google.com/bigquery/docs/tables
https://cloud.google.com/bigquery/docs/reference/standard-sql/
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-avro
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-json
https://cloud.google.com/bigquery/docs/loading-data-cloud-storage-csv
https://cloud.google.com/bigquery/pricing
https://cloud.google.com/bigquery/docs/dataset-locations
https://cloud.google.com/docs/authentication/
https://cloud.google.com/bigquery/docs/authentication/
See Also

spark_read_source, spark_write_bigquery, bigquery_defaults

Other Spark serialization routines: spark_write_bigquery
Examples

## Not run:
config <- spark_config()

sc <- spark_connect(master = "local", config = config)

bigquery_defaults(
  billingProjectId = "<your_billing_project_id>",
  gcsBucket = "<your_gcs_bucket>",
  datasetLocation = "US",
  serviceAccountKeyFile = "<your_service_account_key_file>",
  type = "direct")

# Reading the public shakespeare data table
# https://cloud.google.com/bigquery/public-data/
# https://cloud.google.com/bigquery/sample-tables
shakespeare <-
  spark_read_bigquery(
    sc,
    name = "shakespeare",
    projectId = "bigquery-public-data",
    datasetId = "samples",
    tableId = "shakespeare")

## End(Not run)
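The example above identifies the source table via projectId, datasetId and tableId. As noted in the Arguments section, a SQL query can be supplied instead of table coordinates. A hypothetical sketch, assuming the same bigquery_defaults() setup as above (the query text is illustrative):

```r
# Hypothetical: read the result of a BigQuery Standard SQL query rather
# than a whole table, by passing sqlQuery instead of datasetId/tableId.
word_totals <-
  spark_read_bigquery(
    sc,
    name = "word_totals",
    sqlQuery = paste(
      "SELECT word, SUM(word_count) AS total_count",
      "FROM `bigquery-public-data.samples.shakespeare`",
      "GROUP BY word"))
```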