spark_read_td: Read a Treasure Data table into a Spark DataFrame


View source: R/spark_td.R

Description

Read a Treasure Data table into a Spark DataFrame

Usage

spark_read_td(sc, name, source, options = list(), repartition = 0,
  memory = TRUE, overwrite = TRUE)

Arguments

sc

A spark_connection.

name

The name to assign to the newly generated table in Spark.

source

Name of the source table on TD, in "database.table" format. Example: "sample_datasets.www_access"

options

A list of strings with additional options.

repartition

The number of partitions used to distribute the generated table. Use 0 (the default) to avoid partitioning.

memory

Boolean; should the data be loaded eagerly into memory? (That is, should the table be cached?)

overwrite

Boolean; overwrite the table with the given name if it already exists?
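
For example, a minimal sketch that combines the arguments above (it assumes sc is an open spark_connection configured for td-spark as described under Details; the name "access_sample" is illustrative, not part of this package):

access <- spark_read_td(
  sc,
  name = "access_sample",                 # illustrative Spark table name
  source = "sample_datasets.www_access",
  repartition = 8,                        # split the result into 8 partitions
  memory = FALSE,                         # do not eagerly cache the table
  overwrite = TRUE                        # replace an existing "access_sample" table
)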

Details

You can read a TD table through td-spark. You must set spark.td.apikey and spark.serializer appropriately in your Spark configuration.
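
For instance, a minimal configuration sketch; the resulting config is then passed to spark_connect() (the full connection flow appears under Examples):

config <- spark_config()
config$spark.td.apikey <- Sys.getenv("TD_API_KEY")   # TD API key from the environment
config$spark.serializer <- "org.apache.spark.serializer.KryoSerializer"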

See Also

Other Spark serialization routines: spark_execute_td_presto, spark_read_td_presto, spark_read_td_query, spark_write_td

Examples

## Not run: 
config <- spark_config()

config$spark.td.apikey <- Sys.getenv("TD_API_KEY")
config$spark.serializer <- "org.apache.spark.serializer.KryoSerializer"
config$spark.sql.execution.arrow.enabled <- "true"

sc <- spark_connect(master = "local", config = config)

www_access <-
  spark_read_td(
    sc,
    name = "www_access",
    source = "sample_datasets.www_access"
  )

## End(Not run)
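
The returned object can be used as a remote dplyr table, so, for example, an aggregation can be pushed down to Spark (a sketch assuming the dplyr package is installed and that www_access has a method column, as the TD sample table does):

## Not run: 
library(dplyr)

## The aggregation runs on Spark; only the summary is collected back to R.
www_access %>%
  group_by(method) %>%
  summarise(n = n())

## End(Not run)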
